Hell guys,
what ingest processor can I use to remove an item from an array?
In my case I want to remove an "error tag" that was added to the "tags" field.
The remove processor removes the whole field, not just an element in it?
In my example the whole "tags" field would be gone, not just the "error tag" which is one of multiple.
I am looking for the opposite processor to "Append".
PS: I just realized I wrote "Hell" instead of "Hello" in my original post. Sorry!
Thanks for your reply @Balu
No worries! I figured that was a typo.
You are correct: the remove processor drops the entire field. I'm not aware of a processor that is the opposite of append. Have you considered using a script processor to iterate over the array and remove the error tags?
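A minimal sketch of such a script processor (assuming the tags live in a top-level tags array; note that in ingest scripts the document's fields are accessed directly on ctx):

```
{
  "script": {
    "source": "if (ctx.tags != null) { ctx.tags.removeIf(t -> t == '_grok_dovecot_nomatch') }"
  }
}
```

removeIf also covers the case where the tag appears more than once, which a single remove(indexOf(...)) does not.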
I have, but for me it's not as "painless" as I'd hoped it would be. I need to learn the language better before I can do so.
Maybe some inspiration could come from Removing elements from an array in a document.
The scripting documentation has an example too.
So I tried this:
if (ctx._source.tags.contains('_grok_dovecot_nomatch')) {
  ctx._source.tags.remove(ctx._source.tags.indexOf('_grok_dovecot_nomatch'))
}
But I get a null pointer exception:
cannot access method/field [tags] from a null def reference
with a pointer to the _source element. The document I am using is from the index and does have _source, so I am not sure where that is coming from.
[
{
"_id": "o7U4PI4BdgQegvMfNBhs",
"_index": ".ds-logs-logs-default-2024.03.04-000004",
"_source": {
...
"message": "imap-postlogin: user=abc, homedir=/..., rip=10.0.0.1, lip=127.0.0.1, arguments=/...",
"tags": [
"journald-log",
"_grokparsefailure",
"_grok_dovecot_nomatch"
],
...
"@timestamp": "2024-03-14T09:08:15.503Z",
...
}
}
]
I had also tried the shorter version with the same result:
ctx._source.tags.remove('_grok_dovecot_nomatch')
Could you share a complete example of a _simulate call, so we can iterate from it? Like this example.
Sure.
POST /_ingest/pipeline/_simulate
{
"pipeline" :
{
"description": "_description",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"%{IMAP_POSTLOGIN_WORD:dovecot.service}: user=%{DOVECOT_USER:dovecot.user}, homedir=%{DATA:dovecot.homedir}, rip=%{IP:dovecot.rip}, lip=%{IP:dovecot.lip}, arguments=%{DATA:dovecot.arguments},"
],
"pattern_definitions": {
"DOVECOT_USER": "%{USERNAME}|%{EMAILADDRESS}|%{DATA}",
"IMAP_POSTLOGIN_WORD": "imap-postlogin"
},
"ignore_missing": true,
"ignore_failure": true
}
},
{
"script": {
"source": "if (ctx._source.tags.contains('_grokparsefailure')) { \n ctx._source.tags.remove(ctx._source.tags.indexOf('_grokparsefailure')) \n}",
"if": "ctx?.dovecot?.service == 'imap-postlogin'"
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"message": "imap-postlogin: user=us@r, homedir=/.../, rip=10.0.0.1, lip=127.0.0.1, arguments=/.../,",
"tags": [
"journald-log",
"_grokparsefailure"
]
}
}
]
}
Try:
POST /_ingest/pipeline/_simulate
{
"pipeline" :
{
"description": "_description",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"%{IMAP_POSTLOGIN_WORD:dovecot.service}: user=%{DOVECOT_USER:dovecot.user}, homedir=%{DATA:dovecot.homedir}, rip=%{IP:dovecot.rip}, lip=%{IP:dovecot.lip}, arguments=%{DATA:dovecot.arguments},"
],
"pattern_definitions": {
"DOVECOT_USER": "%{USERNAME}|%{EMAILADDRESS}|%{DATA}",
"IMAP_POSTLOGIN_WORD": "imap-postlogin"
},
"ignore_missing": true,
"ignore_failure": true
}
},
{
"script": {
"source": """
if (ctx.tags != null && ctx.tags.contains('_grokparsefailure')) {
ctx.tags.remove(ctx.tags.indexOf('_grokparsefailure'));
}""",
"if": "ctx?.dovecot?.service == 'imap-postlogin'"
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"message": "imap-postlogin: user=us@r, homedir=/.../, rip=10.0.0.1, lip=127.0.0.1, arguments=/.../,",
"tags": [
"journald-log",
"_grokparsefailure"
]
}
}
]
}
This seems to work. Thank you.
I do understand the extra check for ctx.tags != null, but I'm still confused about when to use _source and when not. The context in processors is already the _source document, but if I run a script somewhere else, it isn't?
PS: An extra processor would make this easier though.
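A note on that last question, based on how the Elasticsearch scripting contexts are documented: in ingest pipeline scripts (the script processor, including _simulate), ctx maps directly onto the document source, while in update scripts (_update, _update_by_query) the source is nested one level down under ctx._source. A small side-by-side sketch:

```
// Ingest pipeline script processor: fields sit directly on ctx
ctx.tags.remove(ctx.tags.indexOf('_grokparsefailure'))

// Update / update-by-query script: the document is wrapped, so go through _source
ctx._source.tags.remove(ctx._source.tags.indexOf('_grokparsefailure'))
```

That is why the original ctx._source.tags version threw a null pointer exception in the pipeline: there is no _source field on ctx in that context, so ctx._source is null.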
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.