Scripted fields: text field returns null

Hi, I have an index with a lot of fields. I am trying to create a scripted field that does some string manipulation on another field, but whenever I try to access that field I get null.
What I've noticed is that accessing integer fields works perfectly, but accessing any text field returns null.
For example, taking these fields into account:

if I do:
emit(doc['run_info.duration'].value.toString());

everything works great
but if I do:
emit(doc['run_info.test'].value.toString());
(with or without the toString()), it's empty. The same is true for every text field, while integer fields work.
I tried copying the document JSON from Discover and using it in the Painless debugger, but it all worked there, so I have no clue where to go next.
Any idea what I'm doing wrong?

No one? :frowning:

Hey @RonGros ,

Generally doc['yourfield'].value should be enough to get the value. Could you share the mappings for your index? It might help to better understand the issue.

Regards, Dima

Hi @Dzmitry, thanks for answering. Unfortunately, .value doesn't work for text fields for me.
Here is the mapping of this field:

"test": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }

We have a huge index, so I won't share all of it.

There are a couple of gotchas when working with text fields in painless.

Text fields are analyzed into individual tokens and have no doc values by default. With "fielddata": true on the text field, Painless can read those tokens via doc, but what comes back is the tokens, not the original string. Without that setting, you will get an illegal_argument_exception when accessing a text field via doc.

Typically, you'll want to use the keyword version of the field, as that will come back as the untokenized string.

However, with "ignore_above": 256, a value longer than 256 characters is not indexed in the keyword sub-field, so doc['test.keyword'] will be empty for that document.
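
For the field in the question, a minimal sketch of the keyword approach might look like the request below. The index name my-index and the runtime field name test_upper are placeholders, not names from the original post, and toUpperCase() merely stands in for whatever string manipulation is actually needed. The size() check skips documents where the keyword sub-field has no value, for example because the text exceeded ignore_above.

```console
# Sketch only: my-index and test_upper are hypothetical names.
GET my-index/_search
{
  "runtime_mappings": {
    "test_upper": {
      "type": "keyword",
      "script": {
        "source": "if (doc['run_info.test.keyword'].size() != 0) { emit(doc['run_info.test.keyword'].value.toUpperCase()); }"
      }
    }
  },
  "fields": ["test_upper"]
}
```

Reading from the keyword sub-field uses regular doc values, so it does not require enabling fielddata on the text field.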

Here's an example that demonstrates the difference between text and keyword fields.

PUT test
{
  "mappings": {
    "properties": {
      "test": {
        "type": "text",
        "fielddata": true,
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

PUT test/_doc/1
{
  "test": "abc 123 do re mi"
}

GET test/_search
{
  "runtime_mappings": {
    "mytext": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['test'].value)"
      }
    },
    "mykeyword": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['test.keyword'].value)"
      }
    }
  },
  "fields": [
    "mytext", "mykeyword"
  ]
}

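Returns the following. Note that mytext holds only a single token ("123", the first one in sorted order), because .value reads one value, while mykeyword holds the whole string: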

    "hits" : [
      {
        "_index" : "test",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "test" : "abc 123 do re mi"
        },
        "fields" : {
          "mytext" : [
            "123"
          ],
          "mykeyword" : [
            "abc 123 do re mi"
          ]
        }
      }
    ]
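
If the full string is needed even for values longer than ignore_above, one possible alternative is to read it from _source instead of doc. Runtime field scripts can reach the source through params._source; this is typically slower than doc values because the source has to be loaded for each document. The sketch below reuses the test index from above; the field name mysource is made up for illustration, and the null check is there for documents that lack the field.

```console
# Sketch only: mysource is a hypothetical runtime field name.
GET test/_search
{
  "runtime_mappings": {
    "mysource": {
      "type": "keyword",
      "script": {
        "source": "def v = params._source['test']; if (v != null) { emit(v.toString()); }"
      }
    }
  },
  "fields": ["mysource"]
}
```

For the document indexed above, this should emit the original, untokenized string, and it does not depend on fielddata or on the keyword sub-field.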

I can't think of a case where doc would return null.

It would be easier to help with a minimal reproduction of the issue.

Thanks @stu !
I believe "fielddata" is the issue here. I did not try it out, because from what I've read it sounds like it would cause performance issues (memory-wise), so I need to find another solution. At least now I know :slight_smile:
