Multiple grok filter

Rodolphe_Redoute · September 4, 2017, 11:11am

Hi,

I have a log file that can contain these 2 types of line:
> 2017-06-28 14:23:04 W Failed to send file to URL [http://some.URL.tv/server/some_tag.m3u8] after [2] attempts, retrying.
> 2017-06-27 10:40:34 I Dmx[24879.1] MPTS PCR wrapped from [95443.694] to [0.011]

I made 2 grok patterns that match them, so here is my filter:

filter {
  if [type] == "elemental_live" {
    grok {
      match => {"message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}"},
      match => {"message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}"}
    }
    if "_grokparsefailure" in [tags] {
      drop {}
    }
    date {
      match => [ "timestamp", "YYYY-MM-dd HH:mm:ss" ]
    }
    mutate {
      remove_field => ["host"]
      remove_field => ["[beat]"]
    }
  }
}

The thing is, if I use only one of the grok patterns the filter works, but when I put both in, I get an error. I suppose it is a syntax error, but I can't find the problem.
Here is the error:

> [2017-09-04T11:54:25,405][ERROR][logstash.agent ] Cannot create pipeline {:reason=>"Expected one of #, } at line 59, column 154 (byte 1472) after filter {\n if [type] == \"elemental_live\" {\n grok {\n match => {\"message\" => \"\\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}\"}"}

Does anyone have an idea, please?

magnusbaeck · September 4, 2017, 11:21am

The comma at the end of the first match line (%{GREEDYDATA:message}"},) shouldn't be there.

While this syntax for multiple grok expressions might otherwise work, the documentation describes a slightly different syntax.

Rodolphe_Redoute · September 4, 2017, 11:27am

OK, the problem seems to be solved without the comma.

Can I have a link to the docs for multiple grok expressions? I'm on Logstash 5.5 (if that matters).

Shaoranlaos · September 4, 2017, 11:41am

Here is the requested link to the docs:
https://www.elastic.co/guide/en/logstash/5.5/plugins-filters-grok.html#plugins-filters-grok-match

Rodolphe_Redoute · September 4, 2017, 11:52am

Thank you for the link.

I changed my grok to:
grok {
  match => {
    "message" => [
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}",
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}"
    ]
  }
}

But it appears that only one of the patterns is used (the first one), even after I restarted the Logstash service.

I'll investigate this.

Thanks for the help.

Shaoranlaos · September 4, 2017, 12:23pm

I think you must switch the two patterns, because grok will stop at the first pattern if it matches your entry,
and CISCO_REASON seems more specific to me than WORD.

Rodolphe_Redoute · September 4, 2017, 12:31pm

Hmm, if I understand correctly: once the first pattern matches, will grok use only that pattern for the following lines? Or does grok try both patterns for every line (or stop at the first one that matches)?

Because this pattern
\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}
is meant to match only this type of line:
2017-06-27 10:40:34 I Dmx[24879.1] MPTS PCR wrapped from [95443.694] to [0.011]

and the other pattern matches only the other type of line.

I need both patterns to be used so that the 2 types of log lines are matched.

So did I misunderstand, and am I not using multiple grok patterns the way I should?

Rodolphe_Redoute · September 4, 2017, 1:02pm

Following your recommendation, Shaoranlaos, I switched the patterns and noticed that, with a small change, I could match both lines with a single grok pattern.
But I would still like an answer to two more things:

1. I changed my filter and restarted Logstash, but no new lines show up in Kibana (the newly matched lines don't appear). I tried restarting the full ELK stack and also refreshed the index pattern in Kibana, but still no change.
2. I'm still interested in how multiple grok patterns should be written (in case I have to use them in the future).

Thx in advance.

Shaoranlaos · September 4, 2017, 1:08pm

To 2.: for every line, grok tries the configured patterns in the order they are defined and stops at the first match (there is an option that tells the filter to try all patterns, but it is not needed in your case).

To 1.: I can't say why there are no entries in Kibana. Have you selected the correct time range? Is there an error message in the Logstash or Elasticsearch logs?
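
For reference, the option mentioned under point 2 is, as far as I know, `break_on_match` (default `true`). Below is a minimal sketch of turning it off, reusing the two patterns from this thread. It is only an illustration and, as said above, not needed for this case:

```
filter {
  grok {
    # Patterns are tried in the order they are listed.
    match => {
      "message" => [
        "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}",
        "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}"
      ]
    }
    # Default is true: grok stops at the first pattern that matches.
    # With false, every pattern in the list is tried against the event.
    break_on_match => false
  }
}
```

With `false`, a field captured by more than one matching pattern (here `timestamp`, `LogLevel` and `key`) may end up holding several values, which is one reason to leave the default in place.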

Shaoranlaos · September 4, 2017, 1:21pm

I have tried the example lines you gave, and it seems the order in which you originally had the patterns (with CISCO_REASON as the second pattern) was the correct one, because the CISCO_REASON pattern matches both lines.
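
One possible way to see which pattern actually matched a given line is to split the patterns into two grok filters and tag each one. This is only a sketch: the tag names `dmx_line` and `send_failure_line` are made up for the example, and the patterns are the ones posted earlier in the thread.

```
filter {
  grok {
    match => { "message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}" }
    # add_tag is applied only when this grok matches (hypothetical tag name).
    add_tag => ["dmx_line"]
    # Do not tag a failure yet; the second grok below still gets a chance.
    tag_on_failure => []
  }
  if "dmx_line" not in [tags] {
    grok {
      match => { "message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}" }
      # Hypothetical tag name for lines matched by the second pattern.
      add_tag => ["send_failure_line"]
    }
  }
}
```

Events that match neither pattern still get `_grokparsefailure` from the second grok, so a later `if "_grokparsefailure" in [tags]` check keeps working.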

Rodolphe_Redoute · September 4, 2017, 1:26pm

To 2: OK, I think I understand how it works now.

To 1: in fact there are entries (only the ones that matched my first grok), and to be sure I checked with "last 6 months".
I have made a filter change like this before, and back then I only had to restart Logstash and refresh the fields in the Kibana index pattern, but now that doesn't work.
The log just says that the start completed correctly.

Yeah, I figured that out too after you gave me the advice.

Rodolphe_Redoute · September 4, 2017, 1:43pm

It feels like either Filebeat is no longer pushing the log files to Logstash, or Logstash doesn't apply its filter before sending them to Elasticsearch, or Elasticsearch can't receive the files,
but I have absolutely no idea where the problem is.

I deleted the old files directly in Elasticsearch and tried to get them reloaded, but the files don't appear anymore (even though I restarted Filebeat, Logstash, Elasticsearch and Kibana).

Shaoranlaos · September 5, 2017, 5:23am

You can rule out Filebeat if it writes its metrics to its log and they contain this entry with a number > 0: libbeat.logstash.published_and_acked_event

You could enable debug logging in Logstash to see whether it processes events (by setting log.level: trace in the config file).

Rodolphe_Redoute · September 5, 2017, 6:51am

Oh well, the problem solved itself overnight, but one thing is particularly surprising:
today's log file (with yesterday's errors) was uploaded to Logstash overnight, but the previous files were not updated with the new filter.

Is there some kind of "flag" once a log file has been processed by Logstash? And if so, is there a way to configure it so that these files can be re-uploaded with a new filter?

Shaoranlaos · September 5, 2017, 7:01am

Then it could have been a mapping problem in Elasticsearch, if you have daily indices.

If you use Filebeat for this, there is a registry file (the default is /var/lib/filebeat/registry) in which it saves all the files and the position up to which it has read each one, after the ack from Logstash.

You could move the files that should be re-uploaded out of the original directory (so that they are no longer in the watched directory), wait a bit for Filebeat to register this (it deletes the entry from the registry file), and then move them back into the directory. If you haven't configured an ignore_older, it should try to upload these files as if they were new.

Rodolphe_Redoute · September 5, 2017, 8:57am

Yes, the problem was the mapping. It works now!

Thank you !

system · October 3, 2017, 8:57am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
