Working with strings in Painless

Hi,

I'm trying to manipulate strings inside an aggregation, but it fails. I have a field mapped as a keyword, let's call it my_field. Some documents hold values like 12.0343.13 or 3.253.88 in this field. I want to aggregate on the position of the first dot. My request is:

{
  "query": {
    "bool": {
      "filter": [],
      "should": [],
      "must_not": []
    }
  },
  "aggs": {
    "parameter": {
      "terms": {
        "script": {
          "inline": "doc['my_field'].value.indexOf('.')",
          "lang": "painless"
        }
      }
    }
  },
  "size": 0
}

but instead of a nice aggregation with values like 1, 2, 3 I get an error:

{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [

        ],
        "script": "doc['my_field'].value.indexOf('.')",
        "lang": "painless"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "test_index",
        "node": "cXuJC5fxR3GsoK6qHMu2bQ",
        "reason": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [

          ],
          "script": "doc['my_field'].value.indexOf('.')",
          "lang": "painless",
          "caused_by": {
            "type": "null_pointer_exception",
            "reason": null
          }
        }
      }
    ]
  },
  "status": 500
}

My ES version

"version" : {
    "number" : "5.5.2",
    "build_hash" : "b2f0c09",
    "build_date" : "2017-08-14T12:33:14.154Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.0"
  }

I cannot use any of the String functions listed in the Painless API Reference for String. I think I'm missing the idea behind using these functions, but I don't know what I'm doing wrong.

Best regards

Hi Dawid,

If I had to guess, some documents have a null value for 'my_field'. If you change your script to the following:

"if (doc['my_field'].value != null) return doc['my_field'].value.indexOf('.'); else return -1;"

does it work?
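
For reference, a sketch of how that guard might look inside the full request. It assumes the same my_field keyword field and the "inline" script key used in the original request; the empty bool query is left out for brevity, the local variable v is only there to avoid reading the value twice, and -1 is an arbitrary marker for documents without a value.

```json
{
  "size": 0,
  "aggs": {
    "parameter": {
      "terms": {
        "script": {
          "inline": "def v = doc['my_field'].value; /* -1 is an arbitrary marker for documents without a value */ return v != null ? v.indexOf('.') : -1;",
          "lang": "painless"
        }
      }
    }
  }
}
```

With this version, documents that have no value should all land in a single -1 bucket instead of failing the whole request.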

Hi Jack,

yes, it works, thank you. But it is still very unintuitive, because this

doc['my_field'].value

works like a charm as a script on its own, without any null handling.

doc['my_field'] returns an accessor for the field, but it doesn't actually try to access the value until you call .value.
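
To illustrate the accessor idea, here is a minimal sketch that asks the accessor whether the document has any value before calling .value. It assumes the accessor exposes the usual list method size() in this version (worth checking against the Painless API reference for your release), and again uses -1 as an arbitrary marker.

```json
{
  "size": 0,
  "aggs": {
    "parameter": {
      "terms": {
        "script": {
          "inline": "/* size() == 0 means this document has no value for my_field; -1 is an arbitrary marker */ return doc['my_field'].size() == 0 ? -1 : doc['my_field'].value.indexOf('.');",
          "lang": "painless"
        }
      }
    }
  }
}
```

The difference from the bare doc['my_field'].value script is that here nothing is ever called on a missing value: the script either returns the marker or calls indexOf on a real string.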

Hi Dawid,

Out of curiosity, what would you personally expect the behaviour to be here? I only ask because I would like to improve the documentation around this if possible.

Thanks!

@rjernst running the code from my previous post as a script works fine, and there I do call .value for every doc, including docs that have no data.

@Jack_Conradson I don't know. I think the documentation is OK, but the behavior is strange. Using doc['my_field'].value is OK, but when I add a method call (indexOf(String)) to it, it doesn't work. Why is there no exception when I only access .value and there is no value?

Anyway, thank you guys for the discussion.

Edit: @Jack_Conradson I know: you could add a note to the description of the script aggregation saying that nulls are handled directly by the aggregation, so it is OK to return null, but in any other case you have to take care of them yourself.
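
Another way to take care of the nulls, sketched here under the assumption that documents without my_field can simply be left out of the aggregation, is to filter them away with an exists query so that the script only runs on documents that have a value. I have not verified this exact request against 5.5.2.

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "exists": { "field": "my_field" } }
      ]
    }
  },
  "aggs": {
    "parameter": {
      "terms": {
        "script": {
          "inline": "/* relies on the exists filter above: only documents with a value should reach this script */ return doc['my_field'].value.indexOf('.');",
          "lang": "painless"
        }
      }
    }
  }
}
```

The trade-off is that documents without a value no longer show up in any bucket, whereas a guard inside the script can count them under a marker value.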
