Null date field is epoch in Painless


(Frederico Galvão) #1

I have an index with a mapping with a field like this:

"some_date_field": {
	"type": "date"
}

When that field is null, a Painless script sees it as a DateTime representing the epoch (along with some other strange behaviour I wasn't aware of), which confused me for a couple of hours.

The following query:

{
  "size": 1,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must_not": [
            {
              "exists": {
                "field": "some_date_field"
              }
            }
          ]
        }
      }
    }
  },
  "_source": [
    "some_date_field"
  ],
  "script_fields": {
    "s": {
      "script": "params._source['some_date_field']"
    },
    "d.f.values": {
      "script": "doc['some_date_field'].values"
    },
    "d.f.values.string": {
      "script": "doc['some_date_field'].values.toString()"
    },
    "d.f.value": {
      "script": "doc['some_date_field'].value"
    },
    "d.f.value.string": {
      "script": "doc['some_date_field'].value.toString()"
    }
  }
}

gives me the following response:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 6763,
    "max_score": 1,
    "hits": [
      {
        "_index": "XXX",
        "_type": "XXX",
        "_id": "XXX",
        "_score": 1,
        "_source": {
          "some_date_field": null
        },
        "fields": {
          "s": [
            null
          ],
          "d.f.values.string": [
            "[]"
          ],
          "d.f.value": [
            "1970-01-01T00:00:00.000Z"
          ],
          "d.f.value.string": [
            "1970-01-01T00:00:00.000Z"
          ]
        }
      }
    ]
  }
}

Notice that:

  • [expected] the source has null, both in _source and in the s script field
  • [unexpected] d.f.values is missing from the response (are empty lists not a valid value?)
  • [unexpected] d.f.values.string shows that d.f.values is an empty list
  • [unexpected] d.f.value comes back as the string form of the epoch DateTime
  • [unexpected] d.f.value.string also comes back as the string form of the epoch DateTime, showing that null was not converted to that value in post-processing (it already is that value inside the script context)

This is part of a migration from 2.4.6 to 6.3.0, which made me realize that a null date field no longer behaves the way it did in Groovy scripts when used in boolean contexts.

I do see, though (I've just tested on my old cluster), that in Groovy on ES 2.4 a null date field was represented as 0 whenever d.f.value was read, which is the epoch in millis, so the new behaviour is more or less expected with regard to compatibility. However, neither ES version has anything in the docs mentioning that a null value on a date field is handled differently (the null_value section of the date datatype says it will be handled as null || missing), so I'm willing to say that this has always been confusing or wrong.

If this is by design, I'd love to be lectured on why, and the docs could gain some extra words to help clarify this.


(Ryan Ernst) #2

null cannot be indexed, since it means a value does not exist. The behavior you see comes from how scripts currently expose doc values, but this is changing to throw an error instead of returning a "default" value (the epoch for date fields, as you have seen). You can check whether a field has any values by looking at the size of the object returned from doc in a script, e.g. doc['myfield'].size() == 0.
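
As a rough illustration of that size() check, the request below is a minimal sketch of a script field that returns null when the document has no value for the date field and the stored value otherwise. The script field name safe_date is hypothetical, some_date_field is taken from the question above, and the object form of "script" with "source" is assumed to match the 6.x syntax; none of this was tested against the cluster in question.

```json
{
  "size": 1,
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "safe_date": {
      "script": {
        "lang": "painless",
        "source": "doc['some_date_field'].size() == 0 ? null : doc['some_date_field'].value"
      }
    }
  }
}
```

With a guard like this, a document whose some_date_field is null should yield null for safe_date rather than 1970-01-01T00:00:00.000Z, and the script should keep working once reading .value on an empty field starts throwing an error.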


(Frederico Galvão) #3

Nice, that snippet is the suggestion I was looking for but couldn't find in the Painless API. I really think it should be noted in the docs as prominently as possible.

Nice to see that at least it'll throw an error instead of returning a fake value.

Thanks for your time!

