Safe numerical return value in scripted fields. Avoiding "null pointers" in filters using scripted fields

Hi,

TL;DR: Is there a safe return value for a numeric scripted field that would behave like a non-existent field in a document (for filtering purposes)?

Btw, I think this ends up relating to another topic I posted.

For instance, with the script:

def audienciaMilis = doc['data_audiencia_pendente'].value.getMillis();
if (audienciaMilis == null || audienciaMilis == 0)
  return null;
ZoneId timeZone = ZoneId.of(ZoneId.SHORT_IDS.get('BET'));
LocalDate dataAudiencia = LocalDateTime.ofInstant(Instant.ofEpochMilli(audienciaMilis),timeZone).toLocalDate().withDayOfMonth(1);
LocalDate dataNow = Instant.ofEpochMilli(new Date().getTime()).atZone(timeZone).toLocalDate().withDayOfMonth(1);
return ChronoUnit.MONTHS.between(dataNow,dataAudiencia);

If I filter it like this (taking the DSL version of a filter created on the dashboard -- this is commented while the version above isn't):

{
  "script": {
    "script": {
      "inline": "boolean gte(Supplier s, def v) {return s.get() >= v} boolean lt(Supplier s, def v) {return s.get() < v}gte(() -> { // Busca a data de registro do processo. Nunca deveria ser nula ou zero, mas se for retorna nulo.\ndef audienciaMilis = doc['data_audiencia_pendente'].value.getMillis();\nif (audienciaMilis == null || audienciaMilis == 0)\n  return null;\n\n// Instancia uma timezone com o horário de Brasilia e duas datas, uma para o início do mẽs da audiência\n// e uma para o início do mês corrente\nZoneId timeZone = ZoneId.of(ZoneId.SHORT_IDS.get('BET'));\n\nLocalDate dataAudiencia = LocalDateTime.ofInstant(Instant.ofEpochMilli(audienciaMilis),timeZone).toLocalDate().withDayOfMonth(1);\nLocalDate dataNow = Instant.ofEpochMilli(new Date().getTime()).atZone(timeZone).toLocalDate().withDayOfMonth(1);\n\nreturn ChronoUnit.MONTHS.between(dataNow,dataAudiencia);\n }, params.gte) && lt(() -> { // Busca a data de registro do processo. Nunca deveria ser nula ou zero, mas se for retorna nulo.\ndef audienciaMilis = doc['data_audiencia_pendente'].value.getMillis();\nif (audienciaMilis == null || audienciaMilis == 0)\n  return null;\n\n// Instancia uma timezone com o horário de Brasilia e duas datas, uma para o início do mẽs da audiência\n// e uma para o início do mês corrente\nZoneId timeZone = ZoneId.of(ZoneId.SHORT_IDS.get('BET'));\n\nLocalDate dataAudiencia = LocalDateTime.ofInstant(Instant.ofEpochMilli(audienciaMilis),timeZone).toLocalDate().withDayOfMonth(1);\nLocalDate dataNow = Instant.ofEpochMilli(new Date().getTime()).atZone(timeZone).toLocalDate().withDayOfMonth(1);\n\nreturn ChronoUnit.MONTHS.between(dataNow,dataAudiencia);\n }, params.lt)",
      "params": {
        "gte": -2,
        "lt": 2,
        "value": ">=-2 <2"
      },
      "lang": "painless"
    }
  }
}

It will throw a null pointer exception, and I get why: some documents do not have a "data_audiencia_pendente" value, so the script returns null for them, which in turn can't be compared to the filter values.

So, going back to the question: what would be a safe way to do this that allows filters to work even when the value does not exist? (Filters do work for documents that lack a regular, non-scripted field.) In this case I cannot assign a negative or zero value, because those are actually valid results from the script.

Thank you very much for any kind of insight here.

Hi Erick,

I think you want to check whether your value exists first and only return a value in that case, something like this:

if ( doc['system.cpu.nice.pct'].size() > 0 ) {
    return (doc['system.cpu.nice.pct'].value * 2);
}

I got the tip to check .size() from an error message returned by Elasticsearch. In your case you might need to check that the size of each field you reference in your calculations is greater than zero.

Regards,
Lee

Hi, thanks. We've switched from checking nulls to checking for size. Thank you very much.

Still, it doesn't really solve the problem of actually wanting to return a null value from a scripted field without having filters freak out about it. Any possible solutions there?

Can you just add another filter for "exists"?

{
  "exists": {
    "field": "your field here"
  }
}

I'm afraid that doesn't really work. Not really sure why since it doesn't even throw an error.


That is very strange. Can you go to the Inspect menu on Discover and look at the Request sent to Elasticsearch? It should have a section like this:

  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "log.file.path"
          }
        },

A word about preventing the null pointer exception: you should check whether the doc contains the key:

doc.containsKey('data_audiencia_pendente')

data_audiencia_pendente isn't a scripted field, right?
cc_meses_para_audiencia_pendente is, right?

Sorry for the delay. I got bogged down by the switch from 6.x to 7.x.
Apparently, it does create the request correctly:

"query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "cc_meses_para_audiencia_pendente"
          }
        },

And here's the current incarnation of that scripted field btw:

if(doc['data_audiencia_pendente'].size() <= 0)
  return null;
def audienciaMilis = doc['data_audiencia_pendente'].value.getMillis();
ZoneId timeZone = ZoneId.of(ZoneId.SHORT_IDS.get('BET'));
LocalDate dataAudiencia = LocalDateTime.ofInstant(Instant.ofEpochMilli(audienciaMilis),timeZone).toLocalDate().withDayOfMonth(1);
LocalDate dataNow = Instant.ofEpochMilli(new Date().getTime()).atZone(timeZone).toLocalDate().withDayOfMonth(1);
return ChronoUnit.MONTHS.between(dataNow,dataAudiencia);

It did change behavior, however: now it simply does nothing when I check for existence or try to limit it.


That's right, data_audiencia_pendente is a regular ingested date field. What's the difference between checking for the key and checking for the size?
We just updated all of our hundreds of scripts to check for size instead of checking for null because of Elastic 7.x.

Checking for size is another way to do it, so it's fine.
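
For the open question of filtering without ever comparing a null, one possible approach is to skip the scripted field in the filter and write a custom query DSL filter against the underlying date field. The sketch below is only an illustration, not something confirmed in this thread. It assumes the filter is entered by hand (for example through the filter editor's query DSL option), it reuses the field name, time zone lookup, and params from the scripts above, and it uses "source" rather than "inline" for the script body. The exists clause targets data_audiencia_pendente because that field is actually indexed. A scripted field such as cc_meses_para_audiencia_pendente is computed at query time, which is most likely why an exists filter on it has no useful effect. The script keeps its own size() guard and returns false for documents without the date, so it does not rely on the exists clause having run first.

{
  "bool": {
    "filter": [
      {
        "exists": {
          "field": "data_audiencia_pendente"
        }
      },
      {
        "script": {
          "script": {
            "lang": "painless",
            "source": "// Sketch (assumption, not from the thread): docs without the date never match, so no null is ever compared.\nif (doc['data_audiencia_pendente'].size() == 0) { return false; }\nlong audienciaMilis = doc['data_audiencia_pendente'].value.getMillis();\nZoneId timeZone = ZoneId.of(ZoneId.SHORT_IDS.get('BET'));\nLocalDate dataAudiencia = LocalDateTime.ofInstant(Instant.ofEpochMilli(audienciaMilis), timeZone).toLocalDate().withDayOfMonth(1);\nLocalDate dataNow = Instant.ofEpochMilli(new Date().getTime()).atZone(timeZone).toLocalDate().withDayOfMonth(1);\nlong meses = ChronoUnit.MONTHS.between(dataNow, dataAudiencia);\nreturn meses >= params.gte && meses < params.lt;",
            "params": {
              "gte": -2,
              "lt": 2
            }
          }
        }
      }
    ]
  }
}

The trade-off is that the month calculation is duplicated between the scripted field (used for display) and this filter, so the two would need to be kept in sync by hand. Whether getMillis() is available on the date value depends on the Elasticsearch version; it is used here only because the scripts in this thread use it.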
