Checking for missing date fields in painless script

Elasticsearch 6.4.3

I have an index synced from PostgreSQL with the original timestamp created_at mapped as a date type. Later, a new timestamp event_timestamp (also a date type) was introduced to the documents, and I want to switch to using it as the index's primary timestamp in Kibana.

Now I am trying to update the index in place with an _update_by_query request that fills event_timestamp with the value of created_at wherever it is missing. But I cannot work out how to write the condition for a missing date value in a Painless script.

When I try running

POST dev-order-events-000001/_update_by_query
{
    "script": {
        "lang": "painless",
        "source": """
           if (ctx._source.event_timestamp == 0) {
              ctx._source.event_timestamp = ctx._source.created_at;
           } else {  
              ctx.op = "noop";
           }
        """
    }
}

the script does not find any match. Checking for ctx._source.event_timestamp.size() == 0 results in the error

{
   "error": {
      "root_cause": [
        {
           "type": "script_exception",
           "reason": "runtime error",
           "script_stack": [
              "if (ctx._source.event_timestamp.size() == 0) {\n        ",
              "                               ^---- HERE"
           ],
           "script": "      if (ctx._source.event_timestamp.size() == 0) {\n        ctx._source.event_timestamp = ctx._spource.created_at;\n      } else {  \n        ctx.op = \"noop\";\n      }",
           "lang": "painless"
        }
     ],
     "type": "script_exception",
     "reason": "runtime error",
     "script_stack": [
        "if (ctx._source.event_timestamp.size() == 0) {\n        ",
        "                               ^---- HERE"
     ],
    "script": "      if (ctx._source.event_timestamp.size() == 0) {\n        ctx._source.event_timestamp = ctx._spource.created_at;\n      } else {  \n        ctx.op = \"noop\";\n      }",
    "lang": "painless",
    "caused_by": {
       "type": "illegal_argument_exception",
       "reason": "Unable to find dynamic method [size] with [0] arguments for class [java.lang.Long]."
    }
 },
"status": 500
}

Checking for doc['event_timestamp'].size() == 0 also does not go through:

{
   "error": {
      "root_cause": [
        {
           "type": "script_exception",
           "reason": "runtime error",
           "script_stack": [
              "if (doc['event_timestamp'].size() == 0) {\n        ",
              "        ^---- HERE"
           ],
           "script": "      if (doc['event_timestamp'].size() == 0) {\n        ctx._source.event_timestamp = ctx._spource.created_at;\n      } else {  \n        ctx.op = \"noop\";\n      }",
           "lang": "painless"
        }
     ],
     "type": "script_exception",
     "reason": "runtime error",
     "script_stack": [
        "if (doc['event_timestamp'].size() == 0) {\n        ",
        "        ^---- HERE"
     ],
     "script": "      if (doc['event_timestamp'].size() == 0) {\n        ctx._source.event_timestamp = ctx._spource.created_at;\n      } else {  \n        ctx.op = \"noop\";\n      }",
     "lang": "painless",
     "caused_by": {
        "type": "null_pointer_exception",
        "reason": null
     }
  },
  "status": 500
}

The description of missing date fields in 6.4/painless-examples.html is rather cryptic and does not specify the returned value:

If you request the value from a field field that isn’t in the document, doc['field'].value for this document returns:
...
epoch date if a field has a date datatype

And the "officially recommended" way to check for missing values mentioned there is

To check if a document is missing a value, you can call doc['field'].size() == 0 .

which I found does not work. :disappointed:

What am I possibly doing wrong?

Are you sure event_timestamp doesn't exist as a field for any documents? To check if the field exists in source, you would use containsKey:

if (ctx._source.containsKey('event_timestamp')) {
  // event timestamp exists
} else {
  // event timestamp does not exist
}

Regarding doc, that is a special variable only available in some script contexts. Unfortunately, update scripts are not one of them.
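
A minimal sketch of an alternative check, assuming ctx._source behaves as a plain map in the update context, so that reading an absent key yields null rather than 0 (which would also explain why the == 0 comparison never matched). Comparing against null additionally catches documents where the field is present but explicitly set to null, a case containsKey alone would treat as "exists". The index and field names are the ones from the question:

```
POST dev-order-events-000001/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": """
      // An absent key reads as null from the _source map;
      // a field explicitly set to null is caught by the same test.
      if (ctx._source.event_timestamp == null) {
        ctx._source.event_timestamp = ctx._source.created_at;
      } else {
        ctx.op = "noop";
      }
    """
  }
}
```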

Thank you so much, @rjernst!
The containsKey method is what I was looking for. After that, the script

POST dev-order-events-000001/_update_by_query
{
    "script": {
        "lang": "painless",
        "source": """
           if (ctx._source.containsKey('event_timestamp')) {
              ctx.op = "noop";
           } else {
              ctx._source.event_timestamp = ctx._source.created_at;
           }
        """
    }
}

worked like a charm. :+1:
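
A possible variation, not tested in this thread: instead of visiting every document and marking most of them as noop, the request could be restricted with a query so that only documents lacking event_timestamp are touched. The sketch below assumes the exists query is wrapped in bool/must_not to select documents with no non-null value in that field; the script then needs no condition at all.

```
POST dev-order-events-000001/_update_by_query
{
  "query": {
    "bool": {
      "must_not": {
        "exists": { "field": "event_timestamp" }
      }
    }
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.event_timestamp = ctx._source.created_at"
  }
}
```

Running the same bool query against _count first is a cheap way to see how many documents would be updated.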
