Multiple grok matches don't work


#1

Hi, I have an ELK stack running on Docker and ship logs through the "gelf" driver. I need to match logs by log format. I have two files (access.log and error.log) and a single tag, "apache". This is my filter:

filter {
    if [tag] == "apache" {
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
            add_field => [ "logtype", "apache-log" ]
        }
        grok {
            match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{DATA:errorcode}: %{GREEDYDATA:message}" }
            add_field => [ "logtype", "apache-error" ]
        }
        if "_grokparsefailure" in [tags] {
            drop {}
        }
    }
}

Only one grok works at a time; with both enabled, no logs come through. If I comment out the first grok I get only error logs, and if I comment out the second I get only access logs:

errors only:

filter {
    if [tag] == "apache" {
        # grok {
        #     match => { "message" => "%{COMBINEDAPACHELOG}" }
        #     add_field => [ "logtype", "apache-log" ]
        # }
        grok {
            match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{DATA:errorcode}: %{GREEDYDATA:message}" }
            add_field => [ "logtype", "apache-error" ]
        }
        if "_grokparsefailure" in [tags] {
            drop {}
        }
    }
}

access only:

filter {
    if [tag] == "apache" {
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
            add_field => [ "logtype", "apache-log" ]
        }
        # grok {
        #     match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{DATA:errorcode}: %{GREEDYDATA:message}" }
        #     add_field => [ "logtype", "apache-error" ]
        # }
        if "_grokparsefailure" in [tags] {
            drop {}
        }
    }
}

Is there a way to have them both?


(Pjanzen) #2

You could add the remove_tag option to the grok filter:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-remove_tag

It is only applied when the grok match succeeds.


#3

Sorry, but I do not understand how that would help me. Could you give an example?


(Pjanzen) #4

Something like this.

filter {
    if [tag] == "apache" {
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
            add_field => [ "logtype", "apache-log" ]
            remove_tag => ['_grokparsefailure']
        }
        grok {
            match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{DATA:errorcode}: %{GREEDYDATA:message}" }
            add_field => [ "logtype", "apache-error" ]
            remove_tag => ['_grokparsefailure']
        }
        if "_grokparsefailure" in [tags] {
            drop {}
        }
    }
}

#5

This way I get only the error log...


(Pjanzen) #6

Then COMBINEDAPACHELOG does not match.

You can test your grok patterns here


#7

But if I comment out the error.log grok, I get the access logs correctly. I have already tested it with grokdebug and it works.


#8

If I invert the order of the grok filters, I get only the access log, like this:

filter {
    if [tag] == "apache" {
        grok {
            match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{GREEDYDATA:message}" }
            add_field => [ "logtype", "apache-error" ]
            remove_tag => ["_grokparsefailure"]
        }
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
            add_field => [ "logtype", "apache-log" ]
            remove_tag => ["_grokparsefailure"]
        }
        if "_grokparsefailure" in [tags] {
            drop {}
        }
    }
}

(Magnus Bäck) #9

Why not set the logtype field already on the input side? Why do you need to use grok to figure out what kind of a log it is?

if [tag] == "apache" {

Do you really have a field named tag? Or did you mean "apache" in [tags]?

I suggest you disable the _grokparsefailure tag (using the tag_on_failure option) and change

if "_grokparsefailure" in [tags] {

into

if ![logtype] {

so that you delete events that haven't had the logtype field set, indicating that none of the grok filters matched.

But really, instead of dropping those events you should save them somewhere. How would you otherwise know if your grok filters are incorrectly failing to match some legitimate input events?
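
A minimal sketch of that suggestion, assuming the event field really is named tag as in the original filter. Each grok filter adds _grokparsefailure on its own when its pattern misses, so with two consecutive groks an event that matches only one of them still ends up tagged. Setting tag_on_failure to an empty list turns that off and leaves logtype as the only success marker. The _unparsed tag, the file path, and the Elasticsearch host are hypothetical placeholders. The overwrite option is an addition that assumes the captured text should replace message rather than be appended to it.

```
filter {
    if [tag] == "apache" {
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
            add_field => { "logtype" => "apache-log" }
            # Do not tag the event when this pattern misses.
            tag_on_failure => []
        }
        grok {
            match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{GREEDYDATA:message}" }
            # Replace message with the captured text instead of appending to it.
            overwrite => [ "message" ]
            add_field => { "logtype" => "apache-error" }
            tag_on_failure => []
        }
        # Neither grok set logtype: mark the event instead of dropping it.
        if ![logtype] {
            mutate { add_tag => [ "_unparsed" ] }
        }
    }
}

output {
    if "_unparsed" in [tags] {
        # Hypothetical location for events that no pattern matched.
        file { path => "/var/log/logstash/unparsed-apache.log" }
    } else {
        # Hypothetical host name; adjust to the actual stack.
        elasticsearch { hosts => [ "elasticsearch:9200" ] }
    }
}
```

Keeping the unmatched events in a separate file makes it possible to see later whether legitimate lines are slipping past both patterns.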


#10

I use the gelf driver for Docker, so I can only apply a "tag" or "label" (https://docs.docker.com/engine/admin/logging/overview/#gelf); I can't apply a tag to each input file separately.
I would like to install Filebeat, but my containers are based on Alpine and I cannot install it because it is only in the edge repository. I have (for now) two types of file, Apache access and Apache error logs. Maybe you mean something like this?

filter {
    if [tag] == "apache" {
        grok {
            match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{GREEDYDATA:message}" }
            add_field => [ "logtype", "apache-error" ]
            remove_tag => ["_grokparsefailure"]
        }
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
            add_field => [ "logtype", "apache-log" ]
            remove_tag => ["_grokparsefailure"]
        }
        if ![logtype] {
            drop {}
        }
    }
}

In any case, the two patterns each work perfectly on their own; the problem is that one excludes the other.


(Magnus Bäck) #11

I have (for now) two types of file, Apache access and Apache error logs. Maybe you mean something like this?

Yes, that looks reasonable except that I'd use tag_on_failure instead of remove_tag.

In any case, the two patterns each work perfectly on their own; the problem is that one excludes the other.

Not sure what you mean by this.
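
For readers who hit the same symptom, the behaviour described in the quote is consistent with how consecutive grok filters tag events. Both groks run on every event, and whichever one misses adds _grokparsefailure. With remove_tag on both, the first grok removes the tag before the second one has had a chance to add it, so only events matching the last grok reach the drop check untagged. That would explain why swapping the order swaps which log type survives. An alternative sketch, not something proposed in the thread, runs the error pattern only when the access pattern did not set logtype, so the original _grokparsefailure check keeps its meaning. The overwrite option is again an assumption about the desired content of message.

```
filter {
    if [tag] == "apache" {
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
            add_field => { "logtype" => "apache-log" }
            # A miss here is expected for error.log lines, so add no failure tag.
            tag_on_failure => []
        }
        # Try the error pattern only if the access pattern did not match.
        if ![logtype] {
            grok {
                match => { "message" => "\[(?<timestamp>%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR})\] \[%{WORD:module}:%{LOGLEVEL:loglevel}\] \[pid %{NUMBER:pid}:tid %{NUMBER:tid}\] \[client %{IP:clientip}:.*\] %{GREEDYDATA:message}" }
                overwrite => [ "message" ]
                add_field => { "logtype" => "apache-error" }
            }
        }
        # _grokparsefailure now means that neither pattern matched.
        if "_grokparsefailure" in [tags] {
            drop {}
        }
    }
}
```

The drop is kept only to mirror the original filter; routing those events to a separate output instead would preserve them for inspection.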

