Remove a Field using a wildcard on reindex

I've done a foolish thing and created a large number of fields in this format:

 "fingerprint.attachment_5C6C2D4E000092B707F41EF1" : "Attachment 5C6C2D4E000092B707F41EF1.gif is an IMAGE and therfore NOT processed"

I used an underscore rather than a period.

I'm trying to reindex to remove these fields using this pipeline, but it's not working. Can anybody point out where I'm going wrong?

PUT _ingest/pipeline/remove-fingerprint-attachment-field
{
  "description" : "Remove Fingerprint Attachment Field pipeline",
  "processors" : [
     {
     "foreach" : {
        "field" : "fingerprint",
         "processor" : {
            "script": {
              "lang": "painless",
              "source": "if (ctx._ingest._value.contains(params.field_part)) {ctx._ingest_value.remove}",
              "params": {
                "params":
                  {"field_part":"attachment_"}
              }
            }
          }
        }
    }  
  ]
}

The error is:

"cause": {
  "type": "exception",
  "reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fingerprint] not present as part of path [fingerprint]",
  "caused_by": {
    "type": "illegal_argument_exception",
    "reason": "java.lang.IllegalArgumentException: field [fingerprint] not present as part of path [fingerprint]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "field [fingerprint] not present as part of path [fingerprint]"
    }
  },
  "header": {
    "processor_type": "foreach"
  }
}

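foreach iterates over the elements of an array stored in a single field, and the source data does not appear to match that. Going by the error, the documents have no fingerprint field at all, only top-level keys whose names begin with fingerprint.attachment_.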

You could likely do this with just a reindex script rather than an ingest pipeline. Below is a quick sample using the ingest simulate API with a script processor: it collects the values of the matching fields into a list (in case there are several) and removes the original fields.

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "_description",
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": """
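// Walk the top-level entries of the document.
Iterator it = ctx.entrySet().iterator();
def tmp_list = new ArrayList();
while (it.hasNext()) {
  def entry = it.next();
  if (entry.getKey().contains("fingerprint.attachment_")) {
    // Keep the message, then drop the field through the iterator.
    tmp_list.add(entry.getValue());
    it.remove();
  }
}
// Store the collected messages under a single key.
ctx.put("fingerprint.errors", tmp_list);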
"""
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "fingerprint.attachment_5C6C2D4E000092B707F41EF1": "Attachment 5C6C2D4E000092B707F41EF1.gif is an IMAGE and therfore NOT processed",
        "fingerprint.attachment_5C6C2D4E000092B707F41EF2": "Attachment 5C6C2D4E000092B707F41EF2.gif is an IMAGE and therfore NOT processed",
        "test_key": 1,
        "test_object": {
          "test_key2": "key 2"
        },
        "test_array": [
          "value1",
          "value2"
        ]
      }
    }
  ]
}
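
For the reindex itself, the same idea could look roughly like the sketch below. It assumes the fields really are stored as top-level keys with a literal dot in the name, as in the simulate sample above. The index names old-index and new-index are placeholders, not names from this thread. Note that in a reindex script the document is reached through ctx._source rather than ctx.

```
POST _reindex
{
  "source": {
    "index": "old-index"
  },
  "dest": {
    "index": "new-index"
  },
  "script": {
    "lang": "painless",
    "source": """
// Assumption: the unwanted fields are top-level keys starting with this prefix.
ctx._source.keySet().removeIf(key -> key.startsWith(params.prefix));
""",
    "params": {
      "prefix": "fingerprint.attachment_"
    }
  }
}
```

This version simply drops the fields instead of collecting their values. It is probably worth trying on a small test index first. If you want to keep the messages, the script processor above could instead be saved as a pipeline and named in the pipeline setting under dest.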

THANK YOU VERY MUCH!!

How do I find out how to write Painless like that? I spent ages on the ES website but missed the Iterator part.

Is there an idiot's guide anywhere?

James

Personally, I just look for Java examples on the web and pay close attention to the slight differences in syntax and to the Java classes that are whitelisted for Painless.
