Issue with escaping pipe "|" in Ingest Pipeline

Hi Guys,

I have a log pattern that looks like this (Windows event log):
Dummy request has been approved.|Node=blah-blah.example.net_iam|BatchSig=jhsdjahsdgsjahdg|Requester=john.test|Recipient=ex0000123456

The grok pattern below seems to do the trick correctly:

%{DATA:winlog.event_data.Message}\|%{GREEDYDATA:kvpairs}


However, the ingest pipeline does not let me save the pattern like that; it is rejected with a JSON error.

If I change the rule to
%{DATA:winlog.event_data.Message}\\|%{GREEDYDATA:kvpairs}

that is, with \\|, the JSON error goes away, but the grok pattern looks wrong with two backslashes.

How can I get this resolved?

My current pipeline, with the grok pattern that I believe is wrong, is this:

PUT _ingest/pipeline/winlogbeat-hitachi
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{DATA:winlog.event_data.Message}\\|%{GREEDYDATA:kvpairs}"
        ],
        "if": "ctx?.winlog?.channel ==~ /Example*/",
        "ignore_failure": true
      }
    },
    {
      "kv": {
        "field": "kvpairs",
        "field_split": "\\|",
        "value_split": "=",
        "target_field": "winlog.event_data",
        "ignore_missing": true,
        "ignore_failure": true
      }
    }
  ]
}

My requirement is this. From the following log message:

Dummy request has been approved.|Node=blah-blah.example.net_iam|BatchSig=jhsdjahsdgsjahdg|Requester=john.test|Recipient=ex0000123456

I need to populate the fields as below.

message:Dummy request has been approved.
winlog.event_data.Node=blah-blah.example.net_iam
winlog.event_data.BatchSig=jhsdjahsdgsjahdg
winlog.event_data.Requester=john.test
winlog.event_data.Recipient=ex0000123456

Have you tried without escaping it? I don't think there is any need to escape the | in an ingest pipeline or in logstash.
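One way to check what the doubled backslash actually does is to decode the JSON string and try the result as a regex. The sketch below is only an approximation: Python's json and re modules stand in for Elasticsearch's JSON parser and grok's regex engine, and the shortened sample line is made up for brevity. It suggests that \\| in the request body becomes \| (a literal pipe) once the JSON is parsed, and that an unescaped | in a regex acts as alternation. Dissect is not regex-based, so there the pipe is just a literal delimiter.

```python
import json
import re

# The grok pattern exactly as it is typed inside the pipeline's JSON body.
json_text = r'"%{DATA:winlog.event_data.Message}\\|%{GREEDYDATA:kvpairs}"'

# JSON decoding collapses the doubled backslash into a single one,
# so the regex engine only ever sees "\|".
decoded = json.loads(json_text)
print(decoded)

# Shortened sample line (assumption: two key/value pairs are enough to illustrate).
sample = "Dummy request has been approved.|Node=blah-blah.example.net_iam|BatchSig=jhsdjahsdgsjahdg"

# Escaped pipe: matched literally, so the line splits at the first "|".
escaped = re.fullmatch(r"(.*?)\|(.*)", sample)
print(escaped.groups())

# Unescaped pipe: regex alternation, so the first branch swallows the whole line
# and the second group is never filled.
unescaped = re.fullmatch(r"(.*?)|(.*)", sample)
print(unescaped.groups())
```

If that reading is right, the pattern saved with \\| was already a valid grok pattern, and any remaining problem would be elsewhere in the pipeline.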

You also do not need grok; the dissect processor can do the same thing and uses less CPU.

Try this:

{
    "dissect": {
        "field": "message",
        "pattern" : "%{winlog.event_data.Message}|%{kvpairs}",
        "if": "ctx?.winlog?.channel ==~ /Example*/",
        "ignore_failure": true
    }
}

Just a drive-by thought: the dissect + kv processors will probably be more computationally efficient, not to mention easier.

This works, and is probably computationally faster than grok.
You can remove the unnecessary

POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "_description",
    "processors": [
      {
        "dissect": {
          "field": "raw_message",
          "pattern": "%{message}|%{key_values}"
        }
      },
      {
        "kv": {
          "field": "key_values",
          "field_split": "\\|",
          "value_split": "=",
          "target_field": "winlog.event_data"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "raw_message" : "Dummy request has been approved.|Node=blah-blah.example.net_iam|BatchSig=jhsdjahsdgsjahdg|Requester=john.test|Recipient=ex0000123456"
      }
    }
  ]
}

Results

{
  "docs" : [
    {
      "doc" : {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "id",
        "_source" : {
          "winlog" : {
            "event_data" : {
              "BatchSig" : "jhsdjahsdgsjahdg",
              "Requester" : "john.test",
              "Recipient" : "ex0000123456",
              "Node" : "blah-blah.example.net_iam"
            }
          },
          "raw_message" : "Dummy request has been approved.|Node=blah-blah.example.net_iam|BatchSig=jhsdjahsdgsjahdg|Requester=john.test|Recipient=ex0000123456",
          "key_values" : "Node=blah-blah.example.net_iam|BatchSig=jhsdjahsdgsjahdg|Requester=john.test|Recipient=ex0000123456",
          "message" : "Dummy request has been approved."
        },
        "_ingest" : {
          "timestamp" : "2021-10-08T20:34:59.240685423Z"
        }
      }
    }
  ]
}
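
For intuition about what the two processors do with the sample line, here is a rough Python sketch. It only mimics the behaviour with plain string splitting; it is not how Elasticsearch implements dissect or kv, and the function name split_message is made up for this illustration.

```python
def split_message(raw, field_split="|", value_split="="):
    """Mimic dissect "%{message}|%{key_values}" followed by kv (illustrative only)."""
    # Stand-in for dissect: everything before the first pipe is the message,
    # everything after it is the key/value string.
    message, _, key_values = raw.partition(field_split)

    # Stand-in for kv: split pairs on "|", then each pair on its first "=".
    event_data = {}
    for pair in key_values.split(field_split):
        key, sep, value = pair.partition(value_split)
        if sep:  # skip fragments that contain no "="
            event_data[key] = value
    return message, event_data


raw = ("Dummy request has been approved.|Node=blah-blah.example.net_iam"
       "|BatchSig=jhsdjahsdgsjahdg|Requester=john.test|Recipient=ex0000123456")
message, event_data = split_message(raw)
print(message)
print(event_data)
```

No regex is involved in either step, which is a plausible reason the dissect + kv route is cheaper than grok.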

Thanks a lot @leandrojmp
My condition is based on the winlog.channel field.

Do I have it incorrectly configured by doing this?

ctx?.winlog?.channel ==~ /example*/

Confirmed, this works:

PUT _ingest/pipeline/winlogbeat-example
{
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{message}|%{kvpairs}",
        "ignore_failure": true
      }
    },
    {
      "kv": {
        "field": "kvpairs",
        "field_split": "\\|",
        "value_split": "=",
        "target_field": "winlog.event_data",
        "ignore_missing": true,
        "ignore_failure": true
      }
    }
  ]
}

I have an issue with the condition.

In the actual documents, I have field values like this:

winlog.channel:Example-Example ID Systems-Example ID Suite/Operational
winlog.channel:Example-Example ID Systems-Example ID Suite/Admin

I need to apply the processors above only to these events.
How can I do this?
Right now I have this, which is clearly not working:
"if": "ctx?.winlog?.channel ==~ /Example*/",

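I think the ==~ is the problem. In Painless, ==~ is the match operator and only succeeds when the whole string matches the regex; the find operator is =~, which looks for the pattern anywhere in the string.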

But you also can try something like this:

"if": "ctx.winlog?.channel?.contains('Example')"
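
To see why the original condition never fires, here is a small Python sketch. Python's re module stands in for Painless regexes here, which is an assumption: re.fullmatch plays the role of the match operator ==~, re.search the role of the find operator =~, and the in check the role of contains. Note that /Example*/ means "Exampl" followed by zero or more "e", not "Example followed by anything".

```python
import re

channel = "Example-Example ID Systems-Example ID Suite/Operational"
pattern = r"Example*"  # "Exampl" plus zero or more "e"

# Whole-string match (role of ==~): fails, the pattern does not cover the full value.
whole_match = re.fullmatch(pattern, channel) is not None

# Find anywhere (role of =~): succeeds, the value starts with "Example".
found = re.search(pattern, channel) is not None

# Plain substring test (role of contains): succeeds, no regex needed.
contains = "Example" in channel

# A whole-string match only works once the rest of the value is allowed for.
whole_match_wildcard = re.fullmatch(r"Example.*", channel) is not None

print(whole_match, found, contains, whole_match_wildcard)
```

The substring check avoids regexes entirely, which keeps the condition simple when only a fixed prefix or fragment matters.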

Thanks a lot, guys @leandrojmp @stephenb.
I have a working pipeline now. You rock :facepunch: :facepunch: :love_you_gesture: :love_you_gesture:
