Grok filter multiple match

Hello,

Can someone point me to the proper way to do grok parsing?

I want to parse syslog messages. For this purpose I created two syslog patterns within a single grok filter:

filter {
  if [type] == 'syslog' {
    grok {
      match => { 'message' => ['%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}',
                               '%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}'] }
      add_field => [ 'received_from', "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ]
      target => "@timestamp"
    }
  }
}

I need two patterns because some hosts send logs with ISO8601 timestamps and the rest use the traditional syslog timestamp format (I might be wrong here, but that is how I understand it).

According to the Logstash documentation, this is valid syntax for grok:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-match

If you need to match multiple patterns against a single field, the value can be an array of patterns:

filter {
  grok {
    match => {
      "message" => [
        "Duration: %{NUMBER:duration}",
        "Speed: %{NUMBER:speed}"
      ]
    }
  }
}

Alternatively, is it possible to parse syslog messages using the following construction, with two separate filter blocks:

filter {
  if [type] == 'syslog' {
    grok {
      match => { 'message' => '%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}' }
      add_field => [ 'received_from', "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "ISO8601" ]
      target => "@timestamp"
    }
  }
}

filter {
  if [type] == 'syslog' {
    grok {
      match => { 'message' => '%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}' }
      add_field => [ 'received_from', "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
      target => "@timestamp"
    }
  }
}

  1. Can I use more than one pattern within a single grok filter?
  2. If I use more than one pattern, will it slow down log parsing or cause logs to be parsed incorrectly (for example, will I get _grokparsefailure)?
  3. Which approach is better, the first or the second, and why?
  4. What is the best way to parse logs using more than one pattern?

Thank you.

You can use more than one pattern in a filter. You should read this, which will help you understand why you should anchor your patterns using ^ if they are expected to match the start of a line. There is very little overhead using multiple patterns if you do that.
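
As a rough sketch of what anchoring means for the two patterns in the question: each one gets a leading ^, so a pattern that cannot match fails at the first character instead of being retried at every offset in the line. This assumes the timestamp is the very first thing in the message field; if your input prepends anything before it, the patterns would need adjusting.

```
filter {
  if [type] == 'syslog' {
    grok {
      # Patterns are tried in order; each is anchored to the start of the line.
      match => { 'message' => ['^%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}',
                               '^%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}'] }
    }
  }
}
```

The add_field, syslog_pri and date parts of the original config are left out here only to keep the example short.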

Thank you, reading now...
What about the two constructions above? Which of them is more correct: matching against multiple patterns in one grok filter, or using the grok filter twice for the same type (syslog)?

Personally I would use an array of patterns.
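
For comparison, here is a sketch of what two separate grok filters would need in order to behave like the array form. With two unconditional grok filters, every event passes through both, and whichever one does not match adds the default _grokparsefailure tag even though the other one matched. The tag name _not_iso8601 below is made up for illustration; tag_on_failure and remove_tag are standard options.

```
filter {
  if [type] == 'syslog' {
    grok {
      match => { 'message' => '^%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}' }
      # Hypothetical marker tag used instead of the default _grokparsefailure.
      tag_on_failure => [ '_not_iso8601' ]
    }
    # Only try the traditional format when the ISO8601 pattern did not match.
    if '_not_iso8601' in [tags] {
      grok {
        match => { 'message' => '^%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}' }
        # remove_tag is applied only when this grok succeeds, so events that
        # match neither pattern keep the marker plus _grokparsefailure.
        remove_tag => [ '_not_iso8601' ]
      }
    }
  }
}
```

With an array of patterns, grok stops at the first pattern that matches (break_on_match defaults to true), so none of this bookkeeping is needed.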
