Null pointer exception while creating a scripted metric

Hi, I am creating a scripted metric to use later in a transform. I am trying to count the documents in which a field has a specific value, but I get the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "state.errores.add(doc['TIPO_ERROR.keyword'].value=='Tecnológico' ? 1:0)",
          "                                           ^---- HERE"
        ],
        "script": "state.errores.add(doc['TIPO_ERROR.keyword'].value=='Tecnológico' ? 1:0)",
        "lang": "painless"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "tandem_atm-0000001",
        "node": "KAgsKQnWQZGdUnUZN1itEA",
        "reason": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [
            "state.errores.add(doc['TIPO_ERROR.keyword'].value=='Tecnológico' ? 1:0)",
            "                                           ^---- HERE"
          ],
          "script": "state.errores.add(doc['TIPO_ERROR.keyword'].value=='Tecnológico' ? 1:0)",
          "lang": "painless",
          "caused_by": {
            "type": "null_pointer_exception",
            "reason": null
          }
        }
      }
    ]
  },
  "status": 400
}

Here is the request that I execute:

POST atm/_search
{
  "size": 0,
  "aggs":{
    "conteo":{
      "scripted_metric": {
        "init_script": "state.erorres = []",
        "map_script": "state.errores.add(doc['TIPO_ERROR.keyword'].value=='Tecnológico' ? 1:0)",
        "combine_script": "int conteo = 0; for (e in state.errores) { conteo += e } return conteo",
        "reduce_script": "int conteo = 0; for (a in states) { conteo += a } return conteo"
      }
    }
  }
}

I know I can use the filter aggregation to do that, but my intention is to create a transform, and transforms don't accept the filter aggregation.

I also created a scripted field with the same script to test it, and it worked, so I don't know why it doesn't work in the scripted metric.

Sorry for the grammatical mistakes. Thank you.

Please do not paste screenshots; they are nearly impossible to read. This forum supports Markdown formatting, which is super useful for code snippets.

You need to check if that field exists before using it.

Thank you for your answer. As I said, I created a scripted field with the same script without any problem, and I am sure that this field exists. That's why I don't understand the problem.

P.S. I replaced the screenshot, sorry for that.

Try to create a minimal reproducible example. My assumption is that a document in your result set does not have this field set.

And what can I do to identify whether a document is missing that field?

You can run a bool query with a must_not clause that contains an exists query to find documents that do not contain this field.

Hope that helps!

Yes, you were right: some of my documents fail the grok parse in Logstash. What I didn't know is that Logstash still indexes the documents that failed the grok. With your query I found documents with the tag "_grokparsefailure". I guess I have to find a way to ignore the documents that failed the grok, but that is more of a Logstash question. Thank you, Alexander, for your help.

Hey,

you can check with something like

if "_grokparsefailure" not in [tags] {
} else {
}

in the output to either ignore such events or write them to another index (the above example is off the top of my head).

You could also check in your scripts whether the field has a value by testing doc['TIPO_ERROR.keyword'].size() > 0 (also off the top of my head).

--Alex

Hi, after running some tests and making changes, I still get the error. I no longer have documents without the field I am testing. If I execute this query:

GET atm/_search
{
    "query": {
        "bool": {
            "must_not": {
                "exists": {
                    "field": "TIPO_ERROR"
                }
            }
        }
    }
}

This is the response from Elasticsearch:

{
  "took" : 2070,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

So I assume all the documents have the field. Then I execute this request:

POST atm/_search
{
  "aggs":{
    "conteo":{
      "scripted_metric": {
        "init_script": "state.erorres = []",
        "map_script": "if(doc['TIPO_ERROR.keyword'].size() > 0){state.errores.add(doc['TIPO_ERROR.keyword'].value =='Tecnológico' ? 1:0)}",
        "combine_script": "int conteo = 0; for (e in state.errores) { conteo += e } return conteo",
        "reduce_script": "int conteo = 0; for (a in states) { conteo += a } return conteo"
      }
    }
  }
}

But the response is still a null_pointer_exception at the value. I really don't know what to do now and would really appreciate your help. Thank you. Here is the response:

{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "state.errores.add(doc['TIPO_ERROR.keyword'].value =='Tecnológico' ? 1:0)}",
          "                                           ^---- HERE"
        ],
        "script": "if(doc['TIPO_ERROR.keyword'].size() > 0){state.errores.add(doc['TIPO_ERROR.keyword'].value =='Tecnológico' ? 1:0)}",
        "lang": "painless"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "tandem_atm-0000001",
        "node": "eo5tScCEQSSx_P3-TP3xMQ",
        "reason": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [
            "state.errores.add(doc['TIPO_ERROR.keyword'].value =='Tecnológico' ? 1:0)}",
            "                                           ^---- HERE"
          ],
          "script": "if(doc['TIPO_ERROR.keyword'].size() > 0){state.errores.add(doc['TIPO_ERROR.keyword'].value =='Tecnológico' ? 1:0)}",
          "lang": "painless",
          "caused_by": {
            "type": "null_pointer_exception",
            "reason": null
          }
        }
      }
    ]
  },
  "status": 400
}

Soooooooo :)

"init_script": "state.erorres = []",
"map_script": "... state.errores.add( ... )",

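See the difference in where the double-r is placed? The init_script creates state.erorres, so state.errores is still null in the map_script, and calling add on it throws the null_pointer_exception.

This worked for me for testing: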

        "init_script": "state.errores = []",
        "map_script": "if(doc['TIPO_ERROR.keyword'].value != null && doc['TIPO_ERROR.keyword'].value == 'foob') { state.errores.add(1) } else {state.errores.add(0)} ",

Hope that helps. In general, I'd try not to keep a list, but maybe just two counters that get incremented, so that less data is sent back and forth. Or even a single counter might be sufficient in this case, which I guess would simplify things.
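
The single-counter idea could look roughly like the request below. This is an untested sketch off the top of my head, not a definitive version: the state key conteo and the null check in the reduce_script are my own assumptions, and the field name, value and index are taken from the original request. Each shard keeps one number instead of a list, so the combine_script only has to return it.

```
POST atm/_search
{
  "size": 0,
  "aggs": {
    "conteo": {
      "scripted_metric": {
        "init_script": "state.conteo = 0",
        "map_script": "if (doc['TIPO_ERROR.keyword'].size() > 0 && doc['TIPO_ERROR.keyword'].value == 'Tecnológico') { state.conteo += 1 }",
        "combine_script": "return state.conteo",
        "reduce_script": "long total = 0; for (s in states) { if (s != null) { total += s } } return total"
      }
    }
  }
}
```

Note that the same key (conteo) is used in init_script, map_script and combine_script, which is exactly what went wrong with erorres vs. errores. The size() check keeps documents without the field from being touched at all, so they simply do not count.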