Mapper Parsing Exception for field with date data that I want to index as a string

In some of my logs, the message body is just a large JSON object. Some of those JSON objects contain a field whose value is in the format YYYY/MM/DD.
E.g.

"param_TRANSACTION_STRINGTEST"=>"2017/01/27"

The json filter seems to be parsing it just fine.
This is not the field I am using for the date of the log, so I don't really care whether or not it's indexed as a date type. However, Elasticsearch seems determined to index it as a date, and since 'YYYY/MM/DD' is apparently an invalid date format, I get the Mapper Parsing Exception:

"error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [param_TRANSACTION_DATE]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2017/01/27\" is malformed at \"/01/27\""}}}},

Currently the only way I've been able to solve this is by going and adding an explicit date conversion block on this field - forcing it to become a date, and telling it what format it is.

However, this is impractical because the fields I'm receiving in the JSON object are not a static set. I could start getting a new field at any time, with a new name and another date-like string in it.

I can't figure out what is causing Elasticsearch to be so dead-set on reading this in as a date, so I don't know how to change the default behavior.

Just to clarify - there are a few other threads out there I've found where the default response is 'use a date filter to convert it to a date', and yes I understand that is a solution, but please keep in mind that's not my issue here - this isn't my primary date field. I want this to be a string. In fact, I'd prefer any date fields that come through in the JSON object to be strings and then later, if I want to use them as date objects, that is when I would want to go back and make a one-off case of converting it to a date.

Just some update info:

I've been doing some basic testing on inserting data, and it looks like Elasticsearch is saying 'that looks like a date, so I'm going to treat it like a date'. Which is all fine and good. So...
Where is that happening? Is that in the code? Is it in a config?

I've been playing with the idea/philosophy of "It's a date, so why not treat it like one", which is probably the same idea behind why Elasticsearch is doing what it's doing. So another follow-up question would be:
Is there a way that I can dynamically handle these occurrences? I'm trying to stay as far away as I can from micro-managing these fields as they come in. What surprises me is that, instead of falling back to indexing the field as a string, Elasticsearch rejects the field and causes an indexing error.
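
For context, the guessing described above comes from dynamic field mapping: date detection is on by default, and new string values are checked against the `dynamic_date_formats` list. Here is a minimal sketch of turning it off so that date-like strings fall through to the string templates. The `logstash-*` pattern is an assumption, and `_default_` assumes a pre-6.x index template where the setting sits at the mapping-type level:

```json
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "date_detection": false
    }
  }
}
```

This only affects indices created after the template is in place. Fields with an explicit `date` mapping are not affected; only the automatic detection is disabled.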

Which version of Elasticsearch is it? What does the mapping for this field look like?

ES Version is 2.3.4
I did a GET request for _mapping/logs and originally there didn't seem to be any mapping for the specific field 'param_TRANSACTION_DATE' in that listing.

I also couldn't find anything relevant in the dynamic section.

Since then I did a few tests using a mutate on the string to remove the forward slashes, as well as doing a date conversion in Logstash using the date{} block, and both of those allowed it to be successfully indexed.

My next test was to create a brand new field in Logstash and test that way.

    add_field => { "testField" => "Test/Data" }
    add_field => { "testField2" => "2016/01/02" }

I tried both of these individually, and testField worked fine without any complaint from ES.
When I tried the 2nd - testField2, it errored again. This is a brand new field name and does not exist in the mappings even now.

Here is the dynamic section of my mapping:

"dynamic_templates": [
               {
                  "message_field": {
                     "mapping": {
                        "fielddata": {
                           "format": "disabled"
                        },
                        "index": "analyzed",
                        "omit_norms": true,
                        "type": "string"
                     },
                     "match": "message",
                     "match_mapping_type": "string"
                  }
               },
               {
                  "string_fields": {
                     "mapping": {
                        "fielddata": {
                           "format": "disabled"
                        },
                        "index": "analyzed",
                        "omit_norms": true,
                        "type": "string",
                        "fields": {
                           "raw": {
                              "ignore_above": 256,
                              "index": "not_analyzed",
                              "type": "string",
                              "doc_values": true
                           }
                        }
                     },
                     "match": "*",
                     "match_mapping_type": "string"
                  }
               },
               {
                  "float_fields": {
                     "mapping": {
                        "type": "float",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "float"
                  }
               },
               {
                  "double_fields": {
                     "mapping": {
                        "type": "double",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "double"
                  }
               },
               {
                  "byte_fields": {
                     "mapping": {
                        "type": "byte",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "byte"
                  }
               },
               {
                  "short_fields": {
                     "mapping": {
                        "type": "short",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "short"
                  }
               },
               {
                  "integer_fields": {
                     "mapping": {
                        "type": "integer",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "integer"
                  }
               },
               {
                  "long_fields": {
                     "mapping": {
                        "type": "long",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "long"
                  }
               },
               {
                  "date_fields": {
                     "mapping": {
                        "type": "date",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "date"
                  }
               },
               {
                  "geo_point_fields": {
                     "mapping": {
                        "type": "geo_point",
                        "doc_values": true
                     },
                     "match": "*",
                     "match_mapping_type": "geo_point"
                  }
               }
            ]

I think this might be the same issue as in https://github.com/elastic/elasticsearch/pull/22174

I tested it with 5.3 and master and it looks like it's fixed.

As a workaround you can add the explicit date format to the dynamic mapping for the date field:

                "date_fields": {
                    "mapping": {
                        "type": "date",
                        "format": "yyyy/MM/dd",
                        "doc_values": true
                    },
                    "match": "*",
                    "match_mapping_type": "date"
                }

Ok great. Thanks.
I'll try adding that.
Quick follow-up: is there any in-depth documentation on how the mapping configs work? I'd like to be able to find the answers to some of my questions myself, but I've had a hard time learning which changes to the mapping template cause which effects. I'd like to know a little more about what setting the explicit format field will do. What kind of negative effects could come of it?
Is there a chance it could dynamically identify dates of other formats, and then fail to parse again because it has a different hard-coded date format? Does the format field there take multiple values? And if so, in what format?
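
On the last question: the `format` parameter of a date mapping does accept several patterns, separated by `||`, and each is tried in turn. Here is a sketch of the workaround template extended this way. Which patterns to list is an assumption about the data; the last two are the built-in names that make up the 2.x default:

```json
"date_fields": {
    "mapping": {
        "type": "date",
        "format": "yyyy/MM/dd||strict_date_optional_time||epoch_millis",
        "doc_values": true
    },
    "match": "*",
    "match_mapping_type": "date"
}
```

A date-like string still only reaches this template if dynamic date detection recognizes it in the first place, and a later value for the same field in a format not listed here would fail to parse.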
