Grok filter does not behave as I want it to


#1

Hello!

I am trying to set up some filters in Logstash, so that my Elasticsearch log gets parsed and displayed nicely in Kibana.

I have exported the pipeline-plain.json for the Elasticsearch server log from Filebeat, and after some tinkering with the grok filter, I ended up with:

filter {
  if [fileset][module] == "elasticsearch" {
      mutate {
        add_field => {
          "raw_message" => "%{message}"
        }
      }
    if [fileset][name] == "server" {
      grok {
        pattern_definitions => {
            "GREEDYMULTILINE" => "(.|\n)*"
            "INDEXNAME" => "[a-zA-Z0-9_.-]*"
        }
        match => { "message" => ["\[%{TIMESTAMP_ISO8601:elasticsearch.server.timestamp}\]\[%{LOGLEVEL:log.level}%{SPACE}?\]\[%{DATA:elasticsearch.server.component}%{SPACE}*\](%{SPACE}*)?(\[%{DATA:elasticsearch.node.name}\])?(%{SPACE}*)?(\[gc\](\[young\]\[%{NUMBER:elasticsearch.server.gc.young.one}\]\[%{NUMBER:elasticsearch.server.gc.young.two}\]|\[%{NUMBER:elasticsearch.server.gc_overhead}\]))?%{SPACE}*((\[%{INDEXNAME:elasticsearch.index.name}\]|\[%{INDEXNAME:elasticsearch.index.name}\/%{DATA:elasticsearch.index.id}\]))?%{SPACE}*%{GREEDYMULTILINE:message}"]}
      }
      date {
        match => [ "elasticsearch.server.timestamp", "ISO8601" ]
      }
    }
  }
}

When this filter is applied to:

[2018-09-17T10:45:35,501][INFO ][o.e.x.s.a.s.FileRolesStore] [xTyQnIt] parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]

I end up with:

"message":["[2018-09-17T10:45:35,501][INFO ][o.e.x.s.a.s.FileRolesStore] [xTyQnIt] parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]","parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]"],

This message is an array consisting of, from what I can understand, my raw_message (the whole log line) and message (which is all I want in message). I am not an expert at all, but I suspect %{GREEDYMULTILINE:message} does something under the hood that I am not aware of.
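I have read that grok appends captured values to a field that already exists rather than replacing it, which would explain the array: message is still populated when the capture lands in it. If that is right, maybe grok's overwrite option would help; something like this (untested):

    grok {
      pattern_definitions => {
          "GREEDYMULTILINE" => "(.|\n)*"
      }
      match => { "message" => ["...%{GREEDYMULTILINE:message}"] }
      # By default grok appends to existing fields; overwrite should make
      # it replace the original message with the captured remainder.
      overwrite => [ "message" ]
    }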

I have done the same thing to parse Logstash logs:

Filter:

filter {
  if [fileset][module] == "logstash" {
      mutate {
        add_field => {
          "raw_message" => "%{message}"
        }
      }
    if [fileset][name] == "log" {
      grok {
        pattern_definitions => {
            "LOGSTASH_CLASS_MODULE" => "[\w\.]+"
            "LOGSTASH_LOGLEVEL" => "(INFO|ERROR|DEBUG|FATAL|WARN|TRACE)"
        }
        match => { "message" => ["\[%{TIMESTAMP_ISO8601:logstash.log.timestamp}\]\[%{LOGSTASH_LOGLEVEL:logstash.log.level}\s*\]\[%{LOGSTASH_CLASS_MODULE:logstash.log.module}\s*\]\s*%{GREEDYDATA:logstash.log.message}"]}
      }
      mutate {
        replace => {
          "message" => "%{logstash.log.message}"
        }
      }
      date {
        match => [ "logstash.log.timestamp", "ISO8601" ]
      }
    }
  }
}

Applied to:

[2018-09-05T12:39:48,285][INFO ][logstash.inputs.metrics  ] Monitoring License OK

With result:

"message":"Monitoring License OK",

Since the filter for Elasticsearch does not use the same naming logic, it sends data straight into message rather than into a named field like Logstash's logstash.log.message, so I don't do any replace mutation there.

If someone with a few minutes to spare could guide me as to what I could do, I'd appreciate it immensely.


#2

My raw_message created my problem, so I sent GREEDYMULTILINE to something other than message, mutated this "else" into message, and I'm a happy grokker!
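In case it helps someone else, the change was roughly this (the temporary field name below is just an example, not necessarily what I used):

    grok {
      pattern_definitions => {
          "GREEDYMULTILINE" => "(.|\n)*"
      }
      # Capture the remainder into a temporary field instead of message,
      # so grok does not append to the existing message field
      match => { "message" => ["...%{GREEDYMULTILINE:tmp_message}"] }
    }
    mutate {
      # Then move the captured text into message and drop the temporary field
      replace => { "message" => "%{tmp_message}" }
      remove_field => [ "tmp_message" ]
    }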


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.