Accessing body of a text field in painless script in the filter context

Rouzbeh_Farahmand · April 23, 2020, 3:11pm

Hello!
I have built an Elasticsearch query with a filter, and in the filter context I am writing a Painless script to filter documents based on the body of a text field. However, when I access the text field, I get a list of terms instead of the original text. I am looking for a way to access the original text in the Painless script. Alternatively, if access to the original text is not possible, I would like to access the document's term frequency vector in this context.

For instance, if I run this query:

```
GET twitter/_search
{
  "query": {
    "bool": {
      "must": {
        "term": { "body": "spark" }
      },
      "filter": [
        {
          "script": {
            "script": {
              "lang": "painless",
              "source": """
                          String text = doc['body'].toString();
                          Debug.explain(text);
                         return true;
                      """
            }
          }
        }
      ]
    }
  }
}
```

I get this response:

```
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 1,
    "failures" : [
      {
        "shard" : 2,
        "index" : "twitter",
        "node" : "AClIunrSRUKb1gbhBz-JoQ",
        "reason" : {
          "type" : "script_exception",
          "reason" : "runtime error",
          "painless_class" : "java.lang.String",
          "to_string" : "[and, by, cutting, doug, hadoop, jack, jim, lucene, made, spark, the, was]",
          "java_class" : "java.lang.String",
          "script_stack" : [
            "Debug.explain(text);\n                         ",
            "              ^---- HERE"
          ],
          "script" : """
                          String text = doc['body'].toString();
                          Debug.explain(text);
                         return true;
                      """,
          "lang" : "painless",
          "caused_by" : {
            "type" : "painless_explain_error",
            "reason" : null
          }
        }
      }
    ]
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
```

As you can see, the debug output shows that `doc['body'].toString()` is in fact a list of terms: [and, by, cutting, doug, hadoop, jack, jim, lucene, made, spark, the, was]. What I would like is access to the original text, which in this example is "body" : "The Lucene was made by Doug Cutting and the hadoop was made by Jim and Spark was made by jack"
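To illustrate why the script sees that list rather than the sentence, here is a rough sketch in Python (not Elasticsearch code). It assumes the field uses something like the default standard analyzer, approximated here by lowercasing and splitting on non-word characters, and that fielddata exposes the unique terms in sorted order. The function name `fielddata_terms` is made up for this illustration.

```python
import re


def fielddata_terms(text):
    """Approximate the terms that doc['body'] exposes for an analyzed text field."""
    # Assumption: the analyzer lowercases and splits on non-word characters.
    tokens = re.findall(r"\w+", text.lower())
    # Duplicates and the original word order are lost; only sorted unique terms remain.
    return sorted(set(tokens))


body = ("The Lucene was made by Doug Cutting and the hadoop was made by Jim "
        "and Spark was made by jack")
terms = fielddata_terms(body)
print("[" + ", ".join(terms) + "]")
# prints: [and, by, cutting, doug, hadoop, jack, jim, lucene, made, spark, the, was]
```

The printed list matches the `to_string` value in the response above, which suggests the script is reading analyzed terms, so the original casing, word order, and repeated words are not recoverable from that value.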

NOTE: I have set "fielddata": true and "store": true on this field, and I have also indexed the document into a body.exact field so that the terms won't get analyzed. Nevertheless, I still can't access the original text in the script in the filter context; I always get the list of unique terms.

Many thanks for your help!

system · May 21, 2020, 3:11pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
