Remove a Field using a wildcard on reindex

hogbinj · May 23, 2019, 1:28pm

I've done a foolish thing and created a large number of fields in the format

 "fingerprint.attachment_5C6C2D4E000092B707F41EF1" : "Attachment 5C6C2D4E000092B707F41EF1.gif is an IMAGE and therfore NOT processed"

I used an underscore rather than a period

I'm trying to reindex to remove these fields using this pipeline, but it's not working. Can anybody point out where I'm going wrong?

PUT _ingest/pipeline/remove-fingerprint-attachment-field
{
  "description" : "Remove Fingerprint Attachment Field pipeline",
  "processors" : [
     {
     "foreach" : {
        "field" : "fingerprint",
         "processor" : {
            "script": {
              "lang": "painless",
              "source": "if (ctx._ingest._value.contains(params.field_part)) {ctx._ingest_value.remove}",
              "params": {
                "params":
                  {"field_part":"attachment_"}
              }
            }
          }
        }
    }  
  ]
}

The error is

"cause": {
  "type": "exception",
  "reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fingerprint] not present as part of path [fingerprint]",
  "caused_by": {
    "type": "illegal_argument_exception",
    "reason": "java.lang.IllegalArgumentException: field [fingerprint] not present as part of path [fingerprint]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "field [fingerprint] not present as part of path [fingerprint]"
    }
  },
  "header": {
    "processor_type": "foreach"
  }
}

jpcarey · May 23, 2019, 11:19pm

foreach iterates over an array, and the source data does not appear to contain one: there is no array field named fingerprint in the document.
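For comparison, here is a minimal sketch of the document shape foreach is designed for. The field name tags and its values are made up for illustration and are not from the original data.

```
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "foreach": {
          "field": "tags",
          "processor": {
            "uppercase": {
              "field": "_ingest._value"
            }
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "tags": ["value1", "value2"]
      }
    }
  ]
}
```

Each array element is exposed to the inner processor as _ingest._value, which is why foreach needs the named field to exist.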

You could likely do this with just a reindex script rather than an ingest pipeline. Below is a quick sample using the ingest simulate API with a script processor. It collects the values of the matching fields into a list (in case there are several) and removes the original fields.

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "_description",
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": """
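// Walk every top-level entry of the document.
Iterator it = ctx.entrySet().iterator();
def tmp_list = new ArrayList();
while (it.hasNext()) {
  def k = it.next();
  // Match the mistakenly created fields by their literal key prefix.
  if (k.getKey().contains("fingerprint.attachment_")) {
    tmp_list.add(k.getValue());
    // Remove through the iterator so the map can be modified while looping.
    it.remove();
  }
}
// Keep the collected messages under a single field.
ctx.put("fingerprint.errors", tmp_list);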
"""
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "fingerprint.attachment_5C6C2D4E000092B707F41EF1": "Attachment 5C6C2D4E000092B707F41EF1.gif is an IMAGE and therfore NOT processed",
        "fingerprint.attachment_5C6C2D4E000092B707F41EF2": "Attachment 5C6C2D4E000092B707F41EF2.gif is an IMAGE and therfore NOT processed",
        "test_key": 1,
        "test_object": {
          "test_key2": "key 2"
        },
        "test_array": [
          "value1",
          "value2"
        ]
      }
    }
  ]
}
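If no pipeline is needed otherwise, the same removal might look roughly like this as a reindex script. This is an untested sketch: old-index and new-index are placeholder names, and it assumes the fields are stored as top-level keys with a literal dot in the name, as in the simulated document above. Unlike the sample above, it only drops the fields and does not collect their values.

```
POST _reindex
{
  "source": {
    "index": "old-index"
  },
  "dest": {
    "index": "new-index"
  },
  "script": {
    "lang": "painless",
    "source": """
// In a reindex script the document body is under ctx._source.
Iterator it = ctx._source.entrySet().iterator();
while (it.hasNext()) {
  def e = it.next();
  if (e.getKey().startsWith(params.prefix)) {
    it.remove();
  }
}
""",
    "params": {
      "prefix": "fingerprint.attachment_"
    }
  }
}
```

Running it against a small test index first is a cheap way to confirm that the keys really have this shape in the stored documents.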

hogbinj · May 24, 2019, 8:03am

THANK YOU VERY MUCH!!

How do I find out how to script Painless like that? I spent ages on the ES website but missed the Iterator part.

Is there an idiot's guide anywhere?

James

jpcarey · May 26, 2019, 6:37pm

Personally, I just look for Java examples on the web, and pay close attention to the slight differences in syntax and to the whitelisted Java classes for Painless.
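One way to try out a Java-style snippet is the Painless execute API, which runs a script without touching any index. A minimal sketch, assuming that API is available in your version; the doc and prefix parameters are made up for the example:

```
POST _scripts/painless/_execute
{
  "script": {
    "source": """
// Copy the parameter map so the script works on its own mutable map.
Map doc = new HashMap(params.doc);
Iterator it = doc.entrySet().iterator();
while (it.hasNext()) {
  def e = it.next();
  if (e.getKey().startsWith(params.prefix)) {
    it.remove();
  }
}
// The remaining keys come back in the response.
return doc.keySet();
""",
    "params": {
      "prefix": "fingerprint.attachment_",
      "doc": {
        "fingerprint.attachment_5C6C2D4E000092B707F41EF1": "some message",
        "test_key": 1
      }
    }
  }
}
```

If a class or method is not whitelisted, the request fails with a compile error naming it, which makes this a quick way to spot the differences from plain Java.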

