Troubles with a Date Field, Null Data and Fielddata

Hello,

I have created an Elasticsearch instance where we place log data from a web site, including information about when people book appointments, and I am having problems with the field that holds the start time of the appointment.

The first problem is that occasionally data comes through Logstash without a start time, which causes this error:

org.elasticsearch.index.mapper.MapperParsingException: failed to parse [appointmentDetails.startDateTime]
//snip//
Caused by: java.lang.IllegalArgumentException: Invalid format: ""

I understand that this is because we are passing an empty value to a field that expects a timestamp. I think I can set the ignore_malformed option on the index so it won't fail. However, since I am creating a new index on a daily basis, I think I need to do this via a template, but I can't quite figure out how to edit the existing Logstash template, or whether that is even advisable. Should I be creating a new template?

The second problem (I can happily create a separate discussion if need be) is that I am getting "Courier Fetch: 5 of 5 shards failed." when searching in Kibana, and this correlates with the following errors in the Elasticsearch logs:

org.elasticsearch.transport.RemoteTransportException: [ElasticServer][<IPAddress>:9300][indices:data/read/search[phase/fetch/id]]
Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default. Set fielddata=true on [appointmentDetails.startDateTime] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

I am not sure what is happening here. Are the two related?

Any pointers here would be greatly appreciated.

Cheers,

Tim

Yes, these are problems relating to mapping.
The date issue is fixed by updating the mapping with the "ignore_malformed" property.
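
For an existing index that would be a mapping update along these lines (the index and type names below are placeholders, the field name is taken from your error message, and I haven't tested it, so treat it as a sketch):

    PUT your-daily-index/_mapping/your-type
    {
      "properties": {
        "appointmentDetails": {
          "properties": {
            "startDateTime": {
              "type": "date",
              "ignore_malformed": true
            }
          }
        }
      }
    }

With that in place, a doc arriving with an empty startDateTime should still be indexed; the malformed value is simply not indexed for that field (it stays in _source).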

The Kibana/text field issue can be addressed by setting fielddata: true on the relevant field, but I'd question the need to do this first, as it may cost a lot of memory. What is the field that you are trying to report on in Kibana? What sorts of values does it contain?
Normally we see Kibana being used on structured data (i.e. keyword rather than text), and if your data is really structured rather than free text there may be a better fix here...
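
For reference, turning fielddata on for a text field is just a mapping update like the one below (the index, type and field names are placeholders, since I don't yet know which field is involved), but reach for this only if the keyword route doesn't fit:

    PUT your-index/_mapping/your-type
    {
      "properties": {
        "some_text_field": {
          "type": "text",
          "fielddata": true
        }
      }
    }

If the values are really structured, pointing your Kibana visualisation at a keyword sub-field (e.g. some_text_field.keyword) avoids the memory cost entirely.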

Hi Mark,

Thanks for the reply. For the ignore_malformed property, I think I should update the existing Logstash template so the indexes that are created daily will pick up the ignore_malformed setting. I was thinking of adding this after the message_field dynamic template in the Logstash template:

      {
        "appStartDate_field": {
          "path_match": "appointmentDetails.startDateTime",
          "mapping": {
            "norms": false,
            "type": "date",
            "ignore_malformed": true
          },
          "match_mapping_type": "date"
        }
      }
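
For context, and if I'm reading the existing template correctly, this is roughly where I was planning to slot it in, with the existing entries left untouched (heavily snipped and untested):

    "mappings": {
      "_default_": {
        "dynamic_templates": [
          {
            "message_field": { \\snipped\\ }
          },
          {
            "appStartDate_field": {
              "path_match": "appointmentDetails.startDateTime",
              "match_mapping_type": "date",
              "mapping": {
                "norms": false,
                "type": "date",
                "ignore_malformed": true
              }
            }
          },
          \\snipped\\
        ],
        \\snipped\\
      }
    }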

I haven't tested this yet, but thought you might be able to save me some trouble if I am on the wrong track.

The entries that we are getting from the web site are appointment details, and we use a 'fire and forget' method so that users get a fast response time. This means that if there is a problem booking the appointment the customer is not aware of it, and we need to be notified instead.

The log entry holds all the data that the customer put into the appointment. We never need to search on the start time; however, we do need to know what start time they requested so the appointment can be created manually.

I don't know if this helps, but this is the mapping for appointmentDetails in one of the existing indexes:

    "appointmentDetails": {
        "properties": {
          "body": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          },
          "endDateTime": {
            "type": "date"
          },
          "startDateTime": {
            "type": "date"
          },
          "subject": {
            "type": "text",
            "norms": false,
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          }
        }
      },

Cheers,

Tim

The Elasticsearch bits of the Logstash template look like they might be the right thing, but I'm not a Logstash expert.
As for the text field, I'd repeat my question: which field are you trying to report on in Kibana that gave you the fielddata error?

I am not really trying to report at all. Logstash sends a Zabbix trigger which alerts us if we get a failed appointment. We then go to Kibana, search for the details, and use the results to recreate the appointment in our application.

When I do that search I search on a traceId field, which is text; however, the error is being reported in Elasticsearch on a date field.
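
In the Kibana search bar that is just a lookup like this, with the real ID substituted in from the Zabbix alert:

    APIResponse.traceId:"<traceId from the Zabbix alert>"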

Sorry if I have missed the point.

Tim

Sorry if I have missed the point.

My bad. I can see from your original error message that appointmentDetails.startDateTime is the field in question.
So is this a "nested" object? Do you have multiple values per elasticsearch doc? What version of elasticsearch are you running?

Hi,

We are running Elasticsearch 5.2.2. This is a full example of a log entry where the booking did not succeed.

[2017-03-28T17:14:12,769][DEBUG][o.e.a.b.TransportShardBulkAction] [ElasticServer] [logstash-wp_prod_logging-2017.03.28][2] failed to execute bulk item (index) index {[logstash-wp_prod_logging-2017.03.28][log][60b1e328e29884ecb328b7c7cf20ff9aa4475c43], source[{"APIResponse":{"traceId":"63e7503d-1916-48ba-9ee6-6a9bee2c9180","displayError":"The request is invalid","data":[{"errorType":"InvalidModel","errorMessage":"Please provide a start date time for the appointment"},{"errorType":"InvalidModel","errorMessage":"Please provide a end date time for the appointment"}],"success":false,"errorMessage":"The request is invalid","statusCode":400},"clientId":"321045","offset":880039,"input_type":"log","appointmentDetails":{"startDateTime":"","subject":"","body":"","endDateTime":""},"referenceType":"0","source":"/var/www/logs/wp-logging/web_all.log","message":"#1490681640|2017-03-28 17:14:00|appointment-service-new-appointment-exceptions|Request:{\"appointmentDetails\":{\"subject\":\"\",\"body\":\"\",\"startDateTime\":\"\",\"endDateTime\":\"\"},\"clientId\":\"321045\",\"referenceId\":\"231142\",\"referenceType\":\"0\",\"obuaId\":\"18257\"} | Response: {\"success\":false,\"statusCode\":400,\"errorMessage\":\"The request is invalid\",\"displayError\":\"The request is invalid\",\"data\":[{\"errorMessage\":\"Please provide a start date time for the appointment\",\"errorType\":\"InvalidModel\"},{\"errorMessage\":\"Please provide a end date time for the appointment\",\"errorType\":\"InvalidModel\"}],\"traceId\":\"63e7503d-1916-48ba-9ee6-6a9bee2c9180\"}","type":"log","referenceId":"231142","tags":["beats_input_codec_plain_applied","parsed","parsed_API_Message","zabbix_sender"],"@timestamp":"2017-03-28T06:14:00.000Z","logName":"appointment-service-new-appointment-exceptions","@version":"1","beat":{"hostname":"mhweb-avmh-web-02","name":"www-prod","version":"5.2.2"},"host":"mhweb-avmh-web-02","EventID":"1490681640","obuaId":"18257"}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [appointmentDetails.startDateTime]

So there are multiple entries per doc, and some docs have more than others, especially the successful ones.

What's confusing to me is the error about fielddata being disabled for appointmentDetails.startDateTime. A date field should be using doc values, so I think I need to see more of the parent-level stuff trimmed from your example mapping, or examples of mappings from the other indices you are querying.
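
If it really is mapped as date everywhere, a plain date aggregation on it should run without any fielddata complaint. An untested sketch, using the index pattern from your log line:

    GET logstash-wp_prod_logging-*/_search
    {
      "size": 0,
      "aggs": {
        "per_day": {
          "date_histogram": {
            "field": "appointmentDetails.startDateTime",
            "interval": "day"
          }
        }
      }
    }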

Hi Mark,

I can copy in the whole index mapping from the day we had that error; however, it is 500 or so lines. Is there something in particular that I can get?

Something that I have just realised, and I am not sure if it is related: there are some other types of bookings that do work. They are in the same index; here is the mapping for those fields:

         "appointment": {
            "properties": {
              "body": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              },
              "endDateTime": {
                "type": "date"
              },
              "startDateTime": {
                "type": "date"
              },
              "subject": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              }
            }
          }

As far as I can tell the only difference is the name appointment compared to appointmentDetails.

Tim

Having a closer look at the previous entry, it is actually nested in a section called 'lead'. So appointmentDetails is in the root of the document, whereas appointment is inside 'lead', which is in the root of the document.

     "lead": {
        "properties": {
          "additionalInfo": {\\snipped\\ },
          "appointment": {
            "properties": {
              "body": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              },
              "endDateTime": {
                "type": "date"
              },
              "startDateTime": {
                "type": "date"
              },
              "subject": {
                "type": "text",
                "norms": false,
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              }
            }
          },
          "campaign": {\\snipped\\},
          "contactus": {\\snipped\\},
          "LeadId": {\\snipped\\},
          "overideId": {\\snipped\\}
        }
      }

I feel like I am barking up the wrong tree but it is the only difference I can find.

Tim

My assumption is that the error should only occur if you ask for an agg on a field mapped as text with fielddata left as the default 'false', so you'll need to make sure that none of your queried indices have that mapping. Fields mapped as date should be aggregatable, so unless there's some funky bug going on with nested fields or similar, I wouldn't expect a date field to trigger it.
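
As a concrete illustration (untested, using the subject field from the mapping you posted): a terms agg on the text field should reproduce the fielddata error, while the same agg on its keyword sub-field should not.

    GET logstash-wp_prod_logging-*/_search
    {
      "size": 0,
      "aggs": {
        "subjects": {
          "terms": { "field": "appointmentDetails.subject" }
        }
      }
    }

Swap the field for "appointmentDetails.subject.keyword" and it should come back without the error.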

I have found that I am only getting the Courier Fetch error and "Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default" when I do a search against three particular indexes.

How do I tell if I have index corruption?

Check the mapping on those indices for the fields you are querying. If they are text (analyzed strings) and you are trying to do aggs on them, Elasticsearch will complain that this requires loading a potentially large amount of data into RAM, aka "fielddata".
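
The field-level mapping API is a quick way to compare them; run something like this against each of the three problem indices (I've used the index name from your earlier log line as an example):

    GET logstash-wp_prod_logging-2017.03.28/_mapping/field/appointmentDetails.*,APIResponse.traceId

Anything that comes back as plain "type": "text" is a candidate for that complaint as soon as Kibana tries to aggregate on it.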
