Range queries with custom format dates

Hi,
I am having some troubles using single and multiple range clauses in queries with date fields that use dynamic mappings.
I frist create an empty index with the following dynamic mappings, including a custom date format:

PUT /new_index
{
  "mappings": {
    "dynamic_date_formats":["YYYY-MM-DD'_'HH:mm:ss.SS", "date_optional_time", "basic_ordinal_date_time", "basic_date_time_no_millis"]
  }
}

Then, the following query:

GET /new_index/_search

{
    "_source": [
        "validity_start"
    ],
    "query": {
        "bool": {
            "must": {
                "range": {
                    "validity_start": {
                        "lt": "2021-12-04_23:57:50.406",
                        "gte": "2021-12-04_23:56:18.406"
                    }
                }
            }
        }
    }
}

Among the results, I am getting both correct hits and out of range documents. For example:

{
...
validity_start: "2021-09-30_23:57:19.992"
...
}

The mapping for the field validity_start reads as:

GET /new_index/_mappings
...
{
        "mappings": {
            "dynamic_date_formats": [
                "YYYY-MM-dd'_'HH:mm:ss.SSS",
                "date_optional_time",
                "basic_ordinal_date_time",
                "basic_date_time_no_millis"
            ],
            "properties": {
                   ....
                        "validity_start": {
                            "type": "date",
                            "format": "YYYY-MM-dd'_'HH:mm:ss.SSS"
                        },
                   ....
             }
              
}

What am I doing wrong? Thanks.

Hi again,
After some tries with still no luck, I figured out that the issue seems to be in how the dates are converted into milliseconds in Elasticsearch. For example, for the value "2021-11-16_00:04:59.655" I am getting a millisecond value of 1609718699655. If I convert this value back, I can see that it is translated to "Monday, January 4, 2021 12:04:59.655 AM". In particular, I can see that the part of the date after the '_' is correctly translated (hours, minutes, seconds and millis are the same), but the YYYY-MM-DD part is completely wrong, hence my strange results. This holds for every other wrong hit I am getting.

Can anybody help please?

"YYYY-MM-DD''HH:mm:ss.SS" is really correct?
"2021-09-30_23:57:19.992" was not mapped as datetime with the dynamic_date_formats specification of "YYYY-MM-DD'
'HH:mm:ss.SS"`. Mapped as keyword.
I suppose your intent is "yyyy-MM-dd_HH:mm:ss.SSS", isn't it? If so, I've got desired output. 'D' is a letter for 'day-of-year' and that may be a reason of the strange results.

Take care about cases because DateTimeFormatter is CASE SENSITIVE. See DateTimeFormatter's page, which is also linked from elasticsearch document.
https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html

Thanks for the answer.
I realized that I wrote the original PUT request in a wrong way. The custom date format is indeed "YYYY-MM-dd'_'HH:mm:ss.SSS", which is the same one that the GET /mappings request shows. So I guess that the DOY is not the issue here, while I see that also the 'Y' should be 'y' as you suggested as well.
Trying to perform the same query with the additional "format" parameter in the range clause actually gives me correct results, despite still not being the ones I am expecting (only a subset). I guess it's a problem of how data has been indexed the very first time.
I'll try to reindex on a new index with the custom format set to "yyyy-MM-dd'
'HH:mm:ss.SSS" and see if the issue is solved.
Thanks a lot, I really don't know how I did not notice that uppercase/lowercase difference!

1 Like

Well that was it apparently. Reindexing on a new index with the custom date format set to yyyy-MM-dd'_ 'HH:mm:ss.SSS solved the issue and I am now getting correct results.
Thanks again for the help.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.