Grok patterns in ingest pipeline do not match all syslog messages

Hello,

I need to create a grok pattern for Elasticsearch/Kibana/Filebeat to parse my message field. I have run into two issues, described after the pipeline.

Here is my pipeline:

PUT /_ingest/pipeline/filebeat-7.6.2-system-syslog-pipeline
{
  "description": "Pipeline for parsing Syslog messages.",
  "processors": [
    {
      "grok": {
        "pattern_definitions": {
          "GREEDYMULTILINE": "(.|\n)*"
        },
        "ignore_missing": true,
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\) %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pi:long}\\])?: %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{GREEDYMULTILINE:system.syslog.message}",
          "%{TIMESTAMP_ISO8601:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pd:long}\\])?: %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname}_%{DATA:host.re} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\) %{GREEDYMULTILINE:system.syslog.message}",
          "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\): %{GREEDYMULTILINE:system.syslog.message}"
        ]
      }
    },
    {
      "rename": {
        "field": "system.syslog.message",
        "target_field": "message",
        "ignore_missing": true
      }
    },
    {
      "remove": {
        "field": "message"
      }
    },
    {
      "date": {
        "formats": [
          "MMM  d HH:mm:ss",
          "MMM dd HH:mm:ss",
          "MMM d HH:mm:ss",
          "ISO8601"
        ],
        "on_failure": [
          {
            "append": {
              "field": "error.message",
              "value": "{{ _ingest.on_failure_message }}"
            }
          }
        ],
        "if": "ctx?.event?.timezone == null",
        "field": "system.syslog.timestamp",
        "target_field": "@timestamp"
      }
    },
    {
      "date": {
        "if": "ctx?.event?.timezone != null",
        "field": "system.syslog.timestamp",
        "target_field": "@timestamp",
        "formats": [
          "MMM  d HH:mm:ss",
          "MMM dd HH:mm:ss",
          "MMM d HH:mm:ss",
          "ISO8601"
        ],
        "timezone": "{{ event.timezone }}",
        "on_failure": [
          {
            "append": {
              "field": "error.message",
              "value": "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    },
    {
      "remove": {
        "field": "system.syslog.timestamp"
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "error.message",
        "value": "{{ _ingest.on_failure_message }}"
      }
    }
  ]
}
  • First, I want to parse messages like the following (among others):

Note: I replaced the actual values with x for confidentiality.

"Feb 7 00:23:34 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (External AS x) changed state from EstabSync to Established (event RsyncAck) (instance master)"

I tested it with the _simulate API, which gave me the result I wanted, but in Kibana Discover the document shows this error: Provided Grok expressions do not match field value: [Feb 14 14:55:11 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (Internal AS x) changed state from Established to Idle (event RecvNotify) (instance master)]. I don't understand why.
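
For reference, a simulate request of roughly the following shape should reproduce the test. This is a minimal sketch, assuming the pipeline is stored under the name used in the PUT request above. The document is the anonymized message from the error, so the x placeholders would need to be replaced with real values (%{IP} does not match the literal x.x.x.x).

```
POST /_ingest/pipeline/filebeat-7.6.2-system-syslog-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Feb 14 14:55:11 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (Internal AS x) changed state from Established to Idle (event RecvNotify) (instance master)"
      }
    }
  ]
}
```

Pasting the exact message text from the failing document in Discover, rather than a retyped sample, should make the comparison with the live behaviour more reliable. Appending ?verbose=true to the URL returns the result of each processor separately.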

  • Second, I would like to parse all of these messages:

"Feb 14 10:13:32 ig1-ar-dc3-02 fpc0 DCBCM [ge-0/0/26]: phy layer link status [2nd try mismatch] is FALSE"

"Feb 9 00:57:14 ig1-edge-th2-01 rpd[9169]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (External AS x) changed state from OpenConfirm to Established (event RecvKeepAlive) (instance master)"

"Feb 7 00:23:34 ig1-edge-dc3-01_re0 rpd[1524]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (External AS x) changed state from EstabSync to Established (event RsyncAck) (instance master)"

"Feb 10 22:40:26 ig1-ar-dc3-01 rpd[2121]: BGP_IO_ERROR_CLOSE_SESSION: BGP peer x.x.x.x (External AS x): Error event Operation timed out(60) for I/O session - closing it"

but it only applies the first pattern of the grok, which is actually "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pi:long}\\])?: %{GREEDYMULTILINE:system.syslog.message}" in my pipeline.
Were the other ones ignored, or is there a limit on the number of patterns?
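
One way to check which pattern matched a given message might be the grok processor's trace_match option, which records the position of the matching entry in the patterns list as _ingest._grok_match_index. The request below is a minimal sketch using an inline pipeline. It includes only two of the patterns from the pipeline above, and both that choice and the sample document are for illustration only.

```
POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "trace_match": true,
          "pattern_definitions": {
            "GREEDYMULTILINE": "(.|\n)*"
          },
          "patterns": [
            "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pid:long}\\])?: %{DATA:bgp.state}: %{DATA:bgp.protocol} %{IP:client.ip} \\(%{DATA:bgp.as}\\) %{GREEDYMULTILINE:system.syslog.message}",
            "%{SYSLOGTIMESTAMP:system.syslog.timestamp} %{SYSLOGHOST:host.hostname} %{DATA:process.name}(?:\\[%{POSINT:process.pi:long}\\])?: %{GREEDYMULTILINE:system.syslog.message}"
          ]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Feb 9 00:57:14 ig1-edge-th2-01 rpd[9169]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer x.x.x.x (External AS x) changed state from OpenConfirm to Established (event RecvKeepAlive) (instance master)"
      }
    }
  ]
}
```

The index should appear under _ingest in each simulated document. As with the sample messages above, the x placeholders would need real values for the %{IP} part of the first pattern to have a chance of matching.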

Thanks for your time.

I really don't know why, but parsing works for some messages and not for others, even though they have the same syntax.

Does anyone know why?
