Date range not working as expected between Elasticsearch 7.17 and Elasticsearch 8.6

We're exploring a move from (self-managed) Elasticsearch 7.17 to Elasticsearch 8.6. Some of our application's tests fail after the switch. We traced the failure to a range query on a date field that returns different results on the two versions: it works properly in 7.17 but returns no hits in 8.6. The mappings on the field are unchanged. Reproducible example below.

Preparing the data

DELETE /test_index

PUT test_index
{
  "mappings": {
    "properties": {
      "published_date": {
        "type" : "date",
        "format" : "epoch_second"
      },
      "name": {
        "type": "text"
      }
    }
  }
}

POST test_index/_bulk
{ "index" : { "_id" : "1" } }
{"published_date" : 1923354000, "name": "Nike Shoes"}
{ "index" : { "_id" : "2" } }
{"published_date" : 1923354000, "name": "Adidas Top"}
{ "index" : { "_id" : "3" } }
{"published_date" : 1355360400, "name": "Valentino Bag"}

Not working as expected

POST test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "published_date": {
              "gte": 0,
              "lte": 2516523214
            }
          }
        }
      ]
    }
  }
}

Oops

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

This returns an empty set, though all the results are supposed to match.
However, the same query returns the expected result, meaning all three records, on Elasticsearch 7.17.

The query can be tweaked like this so that it works on 8.6 too:

POST test_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "published_date": {
              "gte": "0",
              "lte": "2516523214"
            }
          }
        }
      ]
    }
  }
}

The change to the query is very small; it's just quoting the dates in the range, and sending them as strings.

(A side note: in Elasticsearch 7.17, this query works on par with the previous one. I haven't noticed any changes between using strings in this range, or using numbers, for 7.17. But for 8.6, numbers just do not work.)

However, it's not an insignificant change for our app; it requires quite a significant rewrite of code and test cases, and possibly the client apps. I wasn't able to find anything about this change in the listed Java time migration changes.

Is this a change that's described somewhere and was intentional, or is this something that broke between the versions? It's a bit weird that we need to query for the basically numeric values as strings.

I'm afraid this is kind of a feature: numbers are interpreted as milliseconds since the epoch, but you have seconds. Only strings are evaluated against the format of the mapping; see the docs.
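To make the arithmetic concrete, here is a minimal sketch in plain Python (not Elasticsearch code) that reads the upper bound from the failing query as milliseconds and compares it with one of the indexed values. The names `as_utc`, `LTE_BOUND` and `STORED_SECONDS` are made up for this illustration, and the sketch assumes the indexed value is compared in milliseconds.

```python
from datetime import datetime, timedelta, timezone

# Upper bound from the failing query and one stored value from the bulk request.
LTE_BOUND = 2516523214
STORED_SECONDS = 1923354000

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_utc(epoch_millis: int) -> datetime:
    """Convert milliseconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=epoch_millis)


# 8.x reading: a bare number in the query is taken as milliseconds.
bound_as_millis = LTE_BOUND
# The field format is epoch_second, so the same instant in milliseconds is 1000x larger.
stored_as_millis = STORED_SECONDS * 1000

print(as_utc(bound_as_millis))   # a date in January 1970
print(as_utc(stored_as_millis))  # a date in 2030
# The stored value lies far above the upper bound, so the document is not matched.
print(stored_as_millis <= bound_as_millis)
```

Read as milliseconds, the whole range covers only the first month of 1970, which is why none of the three documents is returned.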

Another workaround could be to add "format": "epoch_second" to the query but I assume that will be just as convoluted as changing the number to a string...

POST test_index/_search
{
  "explain": true, 
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "published_date": {
              "gte": 0,
              "lte": 2516523214,
              "format": "epoch_second"
            }
          }
        }
      ]
    }
  }
}
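If the query bodies are built in application code, one way to keep the change small might be a single post-processing step that adds the format before the request is sent. The following is only a sketch under that assumption: `add_range_format` is a hypothetical helper, not part of any Elasticsearch client, and it only handles range queries shaped like the one above.

```python
import copy
from typing import Any, Iterable


def add_range_format(query: Any, date_fields: Iterable[str],
                     fmt: str = "epoch_second") -> Any:
    """Return a copy of `query` whose range clauses on `date_fields` carry an explicit format."""
    fields = set(date_fields)
    patched = copy.deepcopy(query)  # leave the caller's query untouched

    def visit(node: Any) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "range" and isinstance(value, dict):
                    for field, bounds in value.items():
                        if field in fields and isinstance(bounds, dict):
                            # Keep a format that is already set explicitly.
                            bounds.setdefault("format", fmt)
                else:
                    visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(patched)
    return patched


body = {
    "query": {
        "bool": {
            "must": [
                {"range": {"published_date": {"gte": 0, "lte": 2516523214}}}
            ]
        }
    }
}
patched_body = add_range_format(body, ["published_date"])
print(patched_body)
```

The numeric bounds stay as they are; only the `format` key is added, so the existing test data would not need to be converted to strings.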

The relevant change here is elastic/elasticsearch pull request #63692, "Fix range query on date fields for number inputs" by cbuescher.

@xeraa Thanks for the response! But if it's a feature, why did it change from v7 to v8? v7 worked well with both numbers and strings in the query.

The change causing this is indeed #63692, and it's a breaking change in 8.0 that we documented in the migration docs and listed in the breaking changes in the release notes. I see why this can be confusing with "epoch_second" as the field format, but specifying the format on the query itself should fix this.
For context, the original issue this change fixed is elastic/elasticsearch issue #63680, "range query on dates can see small numbers of millis as a year". Users were complaining that when they ran range searches with numeric values that should be interpreted as epoch_millis (our default internal storage format), we might mistake them for years because of how we parse such values internally. To remedy this, we decided to interpret all numeric inputs as epoch_millis when no explicit format is given in the query. You can now either quote the value as a string or specify the format explicitly in the query, as @xeraa already suggested.
If you think this behavior should be changed back, we're certainly open to discussing this via an issue on GitHub. However, since "epoch_millis" is our default numeric format and every decision we make with regard to "unparsed" numeric inputs will have to lean one way or the other, I'm not sure we want to change that to seconds again.
