5.0 - Comparing dates in painless


(Blake Niemyjski) #1

I'm upgrading from groovy to painless and it's not so painless so far. I wanted to post this here as I think this would be a pretty common scenario and I'm having trouble with dates (gotta love dates). Ideas?

PUT /test-stacks-v1
{
  "aliases": {
    "test-stacks": {}
  },
  "mappings": {
    "stacks": {
      "_all": {
        "enabled": false
      },
      "dynamic": false,
      "properties": {
        "id": {
          "type": "keyword"
        },
        "first_occurrence": {
          "type": "date"
        },
        "last_occurrence": {
          "type": "date"
        },
        "total_occurrences": {
          "type": "float"
        }
      }
    }
  }
}

PUT /test-stacks/stacks/1ecd0826e447a44e78877ab1
{
  "id": "1ecd0826e447a44e78877ab1",
  "total_occurrences": 0,
  "first_occurrence": "0001-01-01T00:00:00",
  "last_occurrence": "0001-01-01T00:00:00"
}

POST /test-stacks/stacks/1ecd0826e447a44e78877ab1/_update?retry_on_conflict=3
{
  "script": { 
    "params": {
      "minOccurrenceDateUtc": "2016-10-25T22:04:19.3272056Z",
      "maxOccurrenceDateUtc": "2016-10-25T22:04:19.3272056Z",
      "count": 1
    },
    "inline": "if (ctx._source.total_occurrences == 0 || ctx._source.first_occurrence > params.minOccurrenceDateUtc) { ctx._source.first_occurrence = params.minOccurrenceDateUtc; } if (ctx._source.last_occurrence < params.maxOccurrenceDateUtc) { ctx._source.last_occurrence = params.maxOccurrenceDateUtc; } ctx._source.total_occurrences += params.count;"
  }
}

The last script update returns

json
{
"error": {
"root_cause": [
{
"type": "remote_transport_exception",
"reason": "[8mJiEA3][local[1]][indices:data/write/update[s]]"
}
],
"type": "illegal_argument_exception",
"reason": "failed to execute script",
"caused_by": {
"type": "script_exception",
"reason": "runtime error",
"caused_by": {
"type": "class_cast_exception",
"reason": "Cannot apply [<] operation to types [java.lang.String] and [java.lang.String]."
},
"script_stack": [
"if (ctx._source.last_occurrence < params.maxOccurrenceDateUtc) { ",
" ^---- HERE"
],
"script": "if (ctx._source.total_occurrences == 0 || ctx._source.first_occurrence > params.minOccurrenceDateUtc) { ctx._source.first_occurrence = params.minOccurrenceDateUtc; } if (ctx._source.last_occurrence < params.maxOccurrenceDateUtc) { ctx._source.last_occurrence = params.maxOccurrenceDateUtc; } ctx._source.total_occurrences += params.count;",
"lang": "painless"
}
},
"status": 400
}


The original groovy script worked flawlessly in 1.7.5 (including linebreaks)
```cs
            var request = new UpdateRequest<Stack, Stack>(GetIndexById(stackId), ElasticType.Type, stackId) {
                RetryOnConflict = 3,
                Script = @"if (ctx._source.total_occurrences == 0 || ctx._source.first_occurrence > minOccurrenceDateUtc) {
                            ctx._source.first_occurrence = minOccurrenceDateUtc;
                            }
                            if (ctx._source.last_occurrence < maxOccurrenceDateUtc) {
                            ctx._source.last_occurrence = maxOccurrenceDateUtc;
                            }
                            ctx._source.total_occurrences += count;",
                Params = new Dictionary<string, object>(3) {
                    { "minOccurrenceDateUtc", minOccurrenceDateUtc },
                    { "maxOccurrenceDateUtc", maxOccurrenceDateUtc },
                    { "count", count }
                }
            };

(Blake Niemyjski) #2

Looks like it's almost impossible to parse an ISO-8601 date with painless (http://stackoverflow.com/questions/2201925/converting-iso-8601-compliant-string-to-java-util-date) am I missing something?


(Jack Conradson) #3

Maybe you already found this, but all supported classes can be found in our whitelist here (https://github.com/elastic/elasticsearch/tree/master/modules/lang-painless/src/main/resources/org/elasticsearch/painless)

Specifically take a look at (https://github.com/elastic/elasticsearch/blob/master/modules/lang-painless/src/main/resources/org/elasticsearch/painless/java.time.format.txt). I'm hoping DateTimeFormatter will be able to parse what you are doing.

The docs for the class you want are here (https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html) and it appears they will indeed format DateTime's using ISO 8601. That stack overflow page is referring to an older version of Java.

I understand this isn't as simple as the Groovy way to do things, but ours goals were to be both secure and performant, so we may not always provide things the same way that Groovy does.


(Blake Niemyjski) #4

To anyone wondering I was able to get this working at least locally, I'm not sure how reliable it is at parsing different date formats as I can't specify Zulu because min date time doesn't have it (pre epoch)

def sf = new SimpleDateFormat(\""yyyy-MM-dd'T'HH:mm:ss\"");
if (ctx._source.total_occurrences == 0 || sf.parse(ctx._source.first_occurrence).after(sf.parse(params.minOccurrenceDateUtc))) {
  ctx._source.first_occurrence = params.minOccurrenceDateUtc;
}
if (sf.parse(ctx._source.last_occurrence).before(sf.parse(params.maxOccurrenceDateUtc))) {
  ctx._source.last_occurrence = params.maxOccurrenceDateUtc;
}
ctx._source.total_occurrences += params.count;

(Blake Niemyjski) #5

Thanks for your response, I created an issue here: https://github.com/elastic/elasticsearch/issues/21121 it would be good to add a date sample / this whitelist text to the docs. I found it after looking through the source code but it was a serious pita for a non java guy working with dates (last time I worked with java was about 10 years ago).


(Blake Niemyjski) #6

What do you think is better? SimpleDateFormat or LocalDate?

    "inline": "def formatter = DateTimeFormatter.ISO_INSTANT; if (ctx._source.total_occurrences == 0 || LocalDate.parse(ctx._source.first_occurrence,formatter).isAfter(LocalDate.parse(params.minOccurrenceDateUtc,formatter))) { ctx._source.first_occurrence = params.minOccurrenceDateUtc; } if (LocalDate.parse(ctx._source.last_occurrence,formatter).isBefore(LocalDate.parse(params.maxOccurrenceDateUtc,formatter))) { ctx._source.last_occurrence = params.maxOccurrenceDateUtc; } ctx._source.total_occurrences += params.count;"

Seems like there is no ISO8601 formatter that works out of the box as the above blows up when your time contains ms 2016-10-25T22:04:20.3272056Z

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[8mJiEA3][local[1]][indices:data/write/update[s]]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "failed to execute script",
    "caused_by": {
      "type": "script_exception",
      "reason": "runtime error",
      "caused_by": {
        "type": "date_time_parse_exception",
        "reason": "Text '2016-10-25T22:04:19.3272056Z' could not be parsed: Unable to obtain LocalDate from TemporalAccessor: {NanoOfSecond=327205600, InstantSeconds=1477433059, MilliOfSecond=327, MicroOfSecond=327205},ISO of type java.time.format.Parsed",
        "caused_by": {
          "type": "date_time_exception",
          "reason": "Unable to obtain LocalDate from TemporalAccessor: {NanoOfSecond=327205600, InstantSeconds=1477433059, MilliOfSecond=327, MicroOfSecond=327205},ISO of type java.time.format.Parsed"
        }
      },
      "script_stack": [
        "java.time.format.DateTimeFormatter.createError(DateTimeFormatter.java:1920)",
        "java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1855)",
        "java.time.LocalDate.parse(LocalDate.java:400)",
        "if (ctx._source.total_occurrences == 0 || LocalDate.parse(ctx._source.first_occurrence,formatter).isAfter(LocalDate.parse(params.minOccurrenceDateUtc,formatter))) { ",
        "                                                                                       ^---- HERE"
      ],
      "script": "def formatter = DateTimeFormatter.ISO_INSTANT; if (ctx._source.total_occurrences == 0 || LocalDate.parse(ctx._source.first_occurrence,formatter).isAfter(LocalDate.parse(params.minOccurrenceDateUtc,formatter))) { ctx._source.first_occurrence = params.minOccurrenceDateUtc; } if (LocalDate.parse(ctx._source.last_occurrence,formatter).isBefore(LocalDate.parse(params.maxOccurrenceDateUtc,formatter))) { ctx._source.last_occurrence = params.maxOccurrenceDateUtc; } ctx._source.total_occurrences += params.count;",
      "lang": "painless"
    }
  },
  "status": 400
}

(Jack Conradson) #7

I just looked through the classes as well. Looks like they include all the formats except that final one with the YYYY-MM-DDThh:mm:ss.sTZD. I believe it can be done with this crazy looking Java class DateTimeFormatterBuilder. (https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html) I appreciate you sort of being a guinea pig here for trying out these Java Date classes. Documentation for this is definitely on my list of things to do.


(Blake Niemyjski) #8

Thanks I'll take a look, it also looks like Instant can parse it to except when the date is "0001-01-01T00:00:00"


(Blake Niemyjski) #9
  "type": "script_exception",
  "reason": "runtime error",
  "caused_by": {
    "type": "null_pointer_exception",
    "reason": null
  },
  "script_stack": [
    "if (!!ctx._source.last_occurrence && Instant.parse(ctx._source.last_occurrence).isBefore(Instant.parse(params.maxOccurrenceDateUtc))) { ",
    "                 ^---- HERE"
  ],

I guess you can't use !! for implicit bool conversions?


(Blake Niemyjski) #10

I get an error with that pattern:

  "type": "script_exception",
      "reason": "runtime error",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Unknown pattern letter: T"

(Blake Niemyjski) #11

This seems like it could really help, but I don't think DateTimeFormat is allowed http://stackoverflow.com/questions/3307330/using-joda-date-time-api-to-parse-multiple-formats

Without this script I'm completely blocked and it looks less and less likely I'll be able to move to painless :frowning:


(Blake Niemyjski) #12

I'm almost thinking I'm going to have to create a local function that returns a numeric data type and return the tick count after tying to parse with try catches as I can't get multiple formats to work.


(Blake Niemyjski) #13

After many hours of struggling I think I have got it...

Instant parseDate(def dt) {
  if (dt != null) {
    try {
      return Instant.parse(dt);
    } catch(DateTimeParseException e) {}
  }
  return Instant.MIN;
}

if (ctx._source.total_occurrences == 0 || parseDate(ctx._source.first_occurrence).isAfter(parseDate(params.minOccurrenceDateUtc))) {
  ctx._source.first_occurrence = params.minOccurrenceDateUtc;
}
if (parseDate(ctx._source.last_occurrence).isBefore(parseDate(params.maxOccurrenceDateUtc))) {
  ctx._source.last_occurrence = params.maxOccurrenceDateUtc;
}
ctx._source.total_occurrences += params.count;

I created a function to try and parse out an Instant (always utc) and I've checked to make sure it never loses any milli/nano seconds. :slight_smile:

We might want to document this somewhere..


(Blake Niemyjski) #14

@Jack_Conradson Some things I found here, after working with painless here are some of my thoughts when coming from JS/groovy

  1. Couldn't use the !! to coerce an object to a boolean.
  2. I couldn't have an empty catch block (I also tried to do def e to catch any error but that didn't work either :D); Might even be nice to just do only try { }
  3. Anyway to add trace messages for debugging?

(Jack Conradson) #15

Sorry for the delayed response. I'll try to answer these in order.

  1. We do not allow coercion because in most cases it doesn't make sense. To me, this is a very broken feature in Groovy. What does is mean to make a Map a boolean, empty or null? Some other arbitrary criteria? What about a String?

  2. Not sure what you mean by empty here; does the script you provided not compile correctly? If it really doesn't then that's a bug. If you mean you can't do catch () {...} that's intentional. If you want a catch all you can use Exception, but the user must define what type of Exception to catch. Using def here doesn't really make sense in Painless.

  3. Only way to debug right now is to throw an exception (like IllegalArgumentException). This is something we are going to fix in the future, but it's going to take some extra time.

Again, sorry for the difficultly you had with using dates, but thanks for working through it!


(Blake Niemyjski) #16
  1. Like in javascript there are clear cases where !! coercion into a boolean (!! will be true if the value exists) but I agree it could be a little goofy.
  2. I wanted a generic catch all try block. I'm not a java programmer I don't know all the exception types that will get thrown. I looked at the JavaDocs and added a try block for just the one they listed, but I know deep down there are other exceptions they will throw... I just wanted a catch all so I could fallback to my default behavior.
  3. I can't wait :slight_smile:

(Nik Everett) #17

For reference the way to do that is: try { } catch (Exception e) {}

I'm fairly torn about what you should stick in that exception handling block. It'd be great if you could log something in there but we don't support that. There is some paranoia around scripts do any IO and logging counts.

I filed


for us to create a suite of practical examples for painless. We have a few but we really need a boat load of them....


(Jack Conradson) #18

I should've have explained the Exception handling a bit more because as Nik pointed out Exception is the generic catch all for Painless. There are a few things that cannot be caught by the user such as the loop protection error when attempting to stop a runaway loop.

As for coercion, I could see us having a potentially custom operator for checking if something exists or not because it could be nice to check something against null as a shortcut. I tend to lean more towards the Java side rather than scripting side simply because less leniency leads to better performance since we can do more at compile-time.


(Blake Niemyjski) #19

I'm pretty sure I tired that exact try catch block and it complained... but I'll take your word for it :slight_smile:

This (https://itunes.apple.com/app/apple-store/id922793622?pt=814382&mt=8&ct=how_i_email) is how I Email now


(Blake Niemyjski) #20

@Jack_Conradson I'm now trying to work on a painless script (for daily index migration ) and I have a date stored as '2017-01-03T14:12:36.2770782-06:00' Is there a good way to get the utc time? I'm back to trying to parse this with out very much luck.. I've tried to convert the doc.date to a date and instant and get errors https://github.com/elastic/elasticsearch/issues/21121 converting long to java long..


Problem looping through array in each doc with Painless