Problem with GROK when coming from filebeat (bracket parsing)

Hello all,

I'm sending some Filebeat logs to an Elasticsearch ingest pipeline to transform them and store them in Elasticsearch.

The input event looks like:

Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] Peer Connection Initiated with [AF_INET]1.2.3.4:37491

The grok expression is:

%{GREEDYDATA:dayweek} %{GREEDYDATA:month} %{GREEDYDATA:daymonth} %{GREEDYDATA:time} %{GREEDYDATA:year} us=%{WORD} %{IPORHOST:ip}:%{POSINT:puerto} \[%{GREEDYDATA:user}\] Peer Connection Initiated with %{GREEDYDATA:rest}

It works in the Kibana Grok Debugger but not in the pipeline.

When creating the pipeline I have to remove the [ and ] characters from the pattern, because otherwise the pattern is rejected as invalid. The error I then get is:

... object mapping for [user] tried to parse field [user] as object, but found a concrete value"}, dropping event!","service.name":"filebeat","ecs.version":"1.6.0"}

How can I strip the brackets from the user and store the data in Elasticsearch? I tried both with and without a backslash before each bracket, but neither works in the pipeline.

Any help would be really appreciated.

Thanks

user is already a defined field in the index mapping, and it is a complex object.
You are trying to write a simple value into a field that is mapped as an object; that is what the error means.

Try this instead
user.name
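To illustrate why the dotted name helps, here is a toy Python sketch of the conflict. It is not how Elasticsearch checks mappings internally; the helper names expand_dotted and find_conflicts and the one-field mapping are made up for the illustration.

```python
def expand_dotted(flat):
    """Expand {"user.name": "x"} into {"user": {"name": "x"}}."""
    nested = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def find_conflicts(mapping, doc, prefix=""):
    """Return paths where the mapping expects an object but the doc holds a plain value."""
    conflicts = []
    for key, expected in mapping.items():
        if key not in doc:
            continue
        path = prefix + key
        if isinstance(expected, dict):
            if isinstance(doc[key], dict):
                conflicts += find_conflicts(expected, doc[key], path + ".")
            else:
                conflicts.append(path)
    return conflicts


# Toy mapping (assumption): "user" is an object with a "name" leaf.
mapping = {"user": {"name": "keyword"}}

bad_doc = expand_dotted({"user": "my_user"})        # shape produced by %{GREEDYDATA:user}
good_doc = expand_dotted({"user.name": "my_user"})  # shape produced by %{USER:user.name}

print(find_conflicts(mapping, bad_doc))   # ['user']
print(find_conflicts(mapping, good_doc))  # []
```

The first document puts a string where the mapping expects an object, which is the situation the "tried to parse field [user] as object, but found a concrete value" error describes.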

Also, using GREEDYDATA for every field is not efficient.

You should look at predefined patterns.
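As a rough illustration of what specific patterns buy you, here is a Python sketch that mimics the idea with ordinary regular expressions. The regexes in PATTERNS are simplified stand-ins I wrote for this example, not the real grok definitions, and user_name is used because Python group names cannot contain a dot.

```python
import re

# Simplified stand-ins for predefined grok patterns (assumptions, the real ones are broader).
PATTERNS = {
    "DAY": r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)",
    "MONTH": r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)",
    "MONTHDAY": r"\d{1,2}",
    "TIME": r"\d{2}:\d{2}:\d{2}",
    "YEAR": r"\d{4}",
    "IPV4": r"\d{1,3}(?:\.\d{1,3}){3}",
    "POSINT": r"[1-9]\d*",
    "USER": r"[A-Za-z0-9._-]+",
}


def field(name, pattern):
    """Build a named capture group, similar in spirit to %{PATTERN:name}."""
    return "(?P<{}>{})".format(name, PATTERNS[pattern])


LINE_RE = re.compile(
    " ".join([
        field("dayweek", "DAY"),
        field("month", "MONTH"),
        field("daymonth", "MONTHDAY"),
        field("time", "TIME"),
        field("year", "YEAR"),
        r"us=\d+",
        field("ip", "IPV4") + ":" + field("puerto", "POSINT"),
        r"\[" + field("user_name", "USER") + r"\]",
        r"Peer Connection Initiated with (?P<rest>.*)",
    ])
)

sample = (
    "Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] "
    "Peer Connection Initiated with [AF_INET]1.2.3.4:37491"
)

fields = LINE_RE.match(sample).groupdict()
print(fields["user_name"])  # my_user
print(fields["rest"])       # [AF_INET]1.2.3.4:37491
```

Each field can only match its own narrow shape, so a malformed line is rejected early instead of being stretched across several catch-all groups.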

If you share your whole pipeline, we might be able to help.

Try this:

PUT _ingest/pipeline/discuss-test
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["""%{DAY:dayweek} %{MONTH:month} %{MONTHDAY:daymonth} %{TIME:time} %{GREEDYDATA:year} us=%{WORD:code1} %{IPORHOST:ip}:%{POSINT:puerto} \[%{USER:user.name}\] Peer Connection Initiated with \[%{WORD:code2}\]%{IPORHOST:other_ip}:%{POSINT:other_puerto}"""]
      }
    }
  ]
}
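A note on the triple quotes above: in the Kibana Console they let you write the pattern without JSON escaping. In a regular JSON string a backslash has to be doubled, which is most likely why the pattern with \[ was rejected as invalid. The small Python sketch below only demonstrates the JSON rule; the request bodies in it are trimmed-down examples, not the full pipeline.

```python
import json

# In a regular JSON string, "\[" is not a legal escape sequence, so parsing fails.
invalid_body = r'{"patterns": ["\[%{USER:user.name}\]"]}'
try:
    json.loads(invalid_body)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False
print(parsed_ok)  # False

# Doubling each backslash makes the JSON valid; it decodes to the intended pattern text.
valid_body = r'{"patterns": ["\\[%{USER:user.name}\\]"]}'
pattern = json.loads(valid_body)["patterns"][0]
print(pattern)  # \[%{USER:user.name}\]
```

So outside the Console (for example when sending plain JSON from a script), the same pattern would need \\[ and \\] instead of triple quotes.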

POST _ingest/pipeline/discuss-test/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] Peer Connection Initiated with [AF_INET]1.2.3.4:37491"
      }
    }
  ]
}

Result

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_id" : "_id",
        "_source" : {
          "code2" : "AF_INET",
          "code1" : "22598",
          "daymonth" : "28",
          "dayweek" : "Thu",
          "year" : "2022",
          "ip" : "1.2.3.4",
          "other_ip" : "1.2.3.4",
          "message" : "Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] Peer Connection Initiated with [AF_INET]1.2.3.4:37491",
          "other_puerto" : "37491",
          "puerto" : "37491",
          "month" : "Jul",
          "time" : "13:57:44",
          "user" : {
            "name" : "my_user"
          }
        },
        "_ingest" : {
          "timestamp" : "2022-07-28T23:14:38.476496Z"
        }
      }
    }
  ]
}

You should probably create a mapping too!
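As a starting point, a mapping body could be assembled like the Python sketch below. The type choices (ip, integer, keyword) are my assumptions and are not taken from this thread, so adjust them to your data; only some of the extracted fields are shown.

```python
import json

# Hypothetical explicit mapping for a few of the fields extracted by the grok pattern.
# Field types here are assumptions made for illustration.
mapping = {
    "properties": {
        "ip": {"type": "ip"},
        "other_ip": {"type": "ip"},
        "puerto": {"type": "integer"},
        "other_puerto": {"type": "integer"},
        "user": {"properties": {"name": {"type": "keyword"}}},
    }
}

# Print the JSON body; the target index or template is left for you to choose.
print(json.dumps(mapping, indent=2))
```

Note that user stays an object with a name subfield, which matches the shape produced by %{USER:user.name}.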

Stephen, thank you so much!!

The trick of wrapping the pattern in """ solved the issue for me. Now it works successfully.

Also, thanks for your advice about not using GREEDYDATA everywhere. It was only for testing purposes.

I really appreciate your help.

Best regards,
Daniel.
