Mapping multiple date formats


(searchersteve) #1

In creating a _mapping, I am using the || operator to allow for indexing multiple formats in a date field. I am following procedures described here in the guide. My problem: The order in which I list the formats matters. For example...

If I specify:

"post_date" : {"type" : "date","format" : "yyyy-MM-dd HH:mm:ss||MM/dd/yyyy||MM-dd-yyyy"}

Then I can successfully index:

"post_date" : "2009-11-15 14:12:12"

If I specify:

 "post_date" : {"type" : "date","format" : "MM/dd/yyyy||MM-dd-yyyy||yyyy-MM-dd HH:mm:ss"}

Then indexing the same post_date fails.

The full code example is at: https://gist.github.com/24bee84dbeae5359583f

I do see in the documentation that the first format listed has a different functionality. It says, "The first format will also act as the one that converts back from milliseconds to a string representation." But I don't see why this would cause the behavior described above.
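To make the expectation concrete, here is a minimal sketch of the semantics the documentation seems to describe: each `||`-separated format is tried in turn when parsing, and only the first one is used to convert back to a string. This is plain Python, not Elasticsearch's actual code. The names `JODA_TO_STRPTIME`, `parse_multi`, and `render` are hypothetical, and the `strptime` patterns are hand-translated equivalents of the formats in the mapping.

```python
from datetime import datetime

# Hand-translated strptime equivalents of the patterns in the mapping
# (an assumption for illustration; Elasticsearch uses its own date parser).
JODA_TO_STRPTIME = {
    "yyyy-MM-dd HH:mm:ss": "%Y-%m-%d %H:%M:%S",
    "MM/dd/yyyy": "%m/%d/%Y",
    "MM-dd-yyyy": "%m-%d-%Y",
}


def parse_multi(value, format_spec):
    """Try each ||-separated format in order; return the first successful parse."""
    for pattern in format_spec.split("||"):
        try:
            return datetime.strptime(value, JODA_TO_STRPTIME[pattern])
        except ValueError:
            continue  # this format did not match, fall through to the next one
    raise ValueError(f"{value!r} matches none of: {format_spec}")


def render(dt, format_spec):
    """Convert back to a string using only the first listed format."""
    first = format_spec.split("||")[0]
    return dt.strftime(JODA_TO_STRPTIME[first])


value = "2009-11-15 14:12:12"
a = parse_multi(value, "yyyy-MM-dd HH:mm:ss||MM/dd/yyyy||MM-dd-yyyy")
b = parse_multi(value, "MM/dd/yyyy||MM-dd-yyyy||yyyy-MM-dd HH:mm:ss")
assert a == b  # under these semantics, order does not affect whether parsing succeeds
```

Under this reading, the order of the formats should only change the output of `render`, never whether a value can be indexed.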

Thoughts?


(Shay Banon) #2

That's a bug in ES (thanks for the simple repro). I created an issue for it: https://github.com/elasticsearch/elasticsearch/issues/977 and pushed a fix to the 0.16 and master branches.

On Friday, May 27, 2011 at 11:29 PM, searchersteve wrote:

In creating a _mapping, I am using the || operator to allow for indexing
multiple formats in a date field. I am following procedures described here
(http://www.elasticsearch.org/guide/reference/mapping/date-format.html)
in the guide. My problem: If I allow for too many formats, parsing fails.
For example...

If I specify:

"post_date" : {"type" : "date","format" : "MM-dd-yyyy||yyyy-MM-dd HH:mm:ss"}

Then I can successfully index:

"post_date" : "2009-11-15 14:12:12"

If I specify:

"post_date" : {"type" : "date","format" : "MM/dd/yyyy||MM-dd-yyyy||yyyy-MM-dd HH:mm:ss"}

Then indexing the same post_date fails.

The full code example is at: https://gist.github.com/24bee84dbeae5359583f

It makes sense that at some point, a parser will get confused if you throw
too many allowed formats at it. You end up with potential ambiguity that the
parser cannot resolve. But I can't see the ambiguity in this case.

Thoughts?


