Reindex text field script manipulation

Hi,

We have been indexing some documents for some time now. Unfortunately, one of the text fields is incorrect. However, the correct value of the field does appear as a substring within the document.

At a basic level, ignoring other fields, the document looks like this:

{
  "field_id": <number>,
  "compositeField" : "<number><string value to extract>"
}

What I want to do is reindex every document, copying the composite field into a new field with the initial part of the string removed. That initial part is always the same as "field_id".

I haven't tried the Reindex API yet, but I have tried some basic expressions using Dev Tools in Kibana.

GET index-pattern-*/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "new_field": {
      "script": {
        "source": "field('compositeField').get(null)"
      }
    }
  }
}

However, even without any manipulation of the source field, it returns an error response.

          "script_stack" : [
            "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:875)",
            "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:112)",
            "org.elasticsearch.index.query.SearchExecutionContext.lambda$lookup$2(SearchExecutionContext.java:512)",
            "org.elasticsearch.search.lookup.SearchLookup.getForField(SearchLookup.java:109)",
            "org.elasticsearch.search.lookup.LeafDocLookup$2.run(LeafDocLookup.java:107)",
            "org.elasticsearch.search.lookup.LeafDocLookup$2.run(LeafDocLookup.java:104)",
            "java.base/java.security.AccessController.doPrivileged(AccessController.java:318)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:104)",
            "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:28)",
            "doc['compositeField']",
            "    ^---- HERE"
          ],
          "script" : "doc['compositeField']",
          "lang" : "painless",
          "position" : {
            "offset" : 4,
            "start" : 0,
            "end" : 22
          },
          "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [customer_EPG_id] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
          }
        }

Is there no way to read a string field in a script? I don't need to sort by the value or aggregate on it, just manipulate the value and store it as a new field.

This will help you. The way to access fields is a bit different between reindexing and runtime fields.
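To illustrate the difference, here is a rough sketch rather than a tested solution. It assumes the field names from your example ("field_id", "compositeField"), a new field called "new_field", and a hypothetical destination index named "index-fixed"; adjust these to your setup. In a search script the original JSON can be read through params['_source'], and in a reindex script through ctx._source. Neither needs fielddata on the text field, which is what the error above complains about.

```
# Preview only: compute the stripped value at search time from _source.
# Reading _source per hit is slow, so keep this to small result sets.
GET index-pattern-*/_search
{
  "size": 5,
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "new_field": {
      "script": {
        "lang": "painless",
        "source": """
          def composite = params['_source']['compositeField'];
          def id = params['_source']['field_id'];
          if (composite == null || id == null) {
            return null;
          }
          String prefix = id.toString();
          // Strip the leading field_id only when it is really there.
          return composite.startsWith(prefix) ? composite.substring(prefix.length()) : composite;
        """
      }
    }
  }
}

# Reindex: write the stripped value into new_field on each copied document.
# "index-fixed" is a hypothetical destination index name.
POST _reindex
{
  "source": {
    "index": "index-pattern-*"
  },
  "dest": {
    "index": "index-fixed"
  },
  "script": {
    "lang": "painless",
    "source": """
      def composite = ctx._source.compositeField;
      def id = ctx._source.field_id;
      if (composite != null && id != null) {
        String prefix = id.toString();
        if (composite.startsWith(prefix)) {
          ctx._source.new_field = composite.substring(prefix.length());
        }
      }
    """
  }
}
```

Documents where the composite value does not start with the id are copied unchanged by the reindex sketch, so you can search for documents missing "new_field" afterwards to find any that did not match the expected pattern.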
