Multiple grok filter


(Rodolphe Redouté) #1

Hi,

I have a log file that can contain these 2 types of lines:
> 2017-06-28 14:23:04 W Failed to send file to URL [http://some.URL.tv/server/some_tag.m3u8] after [2] attempts, retrying.
> 2017-06-27 10:40:34 I Dmx[24879.1] MPTS PCR wrapped from [95443.694] to [0.011]

I made 2 grok patterns that match them, so here is my filter:

filter {
  if [type] == "elemental_live" {
    grok {
      match => {"message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}"},
      match => {"message" => "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}"}
    }
    if "_grokparsefailure" in [tags] {
      drop {}
    }
    date {
      match => [ "timestamp", "YYYY-MM-dd HH:mm:ss" ]
    }
    mutate {
      remove_field => ["host"]
      remove_field => ["[beat]"]
    }
  }
}

The thing is, if I use a single grok pattern to match only 1 type of line, the filter works. But when I put in both grok patterns, I get an error, so I suppose it is a syntax error, but I can't find the problem.
Here is the error:

> [2017-09-04T11:54:25,405][ERROR][logstash.agent ] Cannot create pipeline {:reason=>"Expected one of #, } at line 59, column 154 (byte 1472) after filter {\n if [type] == \"elemental_live\" {\n grok {\n match => {\"message\" => \"\\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}\"}"}

Does anyone have an idea, please?


(Magnus Bäck) #2

The comma at the end of the first match line (%{GREEDYDATA:message}"},) shouldn't be there.

While this syntax for multiple grok expressions might otherwise work, the documentation describes a slightly different syntax.


(Rodolphe Redouté) #3

OK, the problem seems to be solved without the comma.

Can I have a link to the docs for multiple grok expressions? I'm on Logstash 5.5 (if that matters).


(Christian Stockhaus) #4

Here is the requested link to the docs:
https://www.elastic.co/guide/en/logstash/5.5/plugins-filters-grok.html#plugins-filters-grok-match


(Rodolphe Redouté) #5

Thank you for the link,

I changed my grok to:
grok {
  match => {
    "message" => [
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}",
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}"
    ]
  }
}

But it appears that only 1 of the patterns is used (the first one), even after I restarted the Logstash service.

I'll investigate this.

Thanks for the help.


(Christian Stockhaus) #6

I think you need to switch the two patterns, because grok stops at the first pattern that matches your entry,
and CISCO_REASON seems more specific to me than WORD.


(Rodolphe Redouté) #7

Hmm, if I understand correctly: once the first pattern matches, will grok use only that pattern for the following lines? Or does grok try both patterns for every line (or stop at the first one that matches)?

because this pattern
\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}
is used to match this type of line only
2017-06-27 10:40:34 I Dmx[24879.1] MPTS PCR wrapped from [95443.694] to [0.011]

and the other pattern matches the other line only.

I need both patterns to be used, to match the 2 types of log lines.

So did I misunderstand, and am I not using multiple grok patterns as I should?


(Rodolphe Redouté) #8

Following your recommendation, Shaoranlaos, I switched the patterns and noticed that, with a small change, I could match both lines with only 1 grok.
I would still appreciate an answer on two more things:

  1. I changed my filter and restarted Logstash, but no new lines show up in Kibana (the newly matched lines don't appear). I tried restarting the whole ELK stack and also refreshed the index pattern in Kibana, but still no change.

  2. I'm still interested in how multiple grok patterns should be written (in case I have to use them in the future).

Thx in advance.


(Christian Stockhaus) #9

To 2.: for every line, grok tries the configured patterns in the order they are defined and stops at the first match (there is an option that tells the filter to try all patterns, but it is not needed in your case).

To 1.: I can't say why there are no entries in Kibana. Have you configured the correct time span? Is there an error message in the Logstash or Elasticsearch logs?
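
For reference, the option mentioned under point 2 is, as far as I know, grok's break_on_match setting. Here is a minimal sketch that reuses the two patterns from this thread. It is only meant to show where the setting goes, and I have not run it against your data:

```
grok {
  # break_on_match defaults to true: grok stops at the first pattern that matches.
  # Setting it to false asks grok to try every pattern in the list.
  break_on_match => false
  match => {
    "message" => [
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}",
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}"
    ]
  }
}
```

Be aware that when several patterns match the same line, a field captured by more than one of them (timestamp or key here) may end up holding multiple values, which is one more reason to keep the default in your case.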


(Christian Stockhaus) #10

I have tried your example lines, and it seems that the order in which you originally had the patterns (with CISCO_REASON as the second pattern) was the correct one, because CISCO_REASON will match both lines.
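
To make that concrete, here is the original ordering again with comments. The comments are my reading of how the stock patterns behave on your two sample lines, so treat them as an assumption and check them in a grok debugger:

```
grok {
  # Pattern 1 is tried first. WORD must be followed directly by non-space
  # text, so it should fit "Dmx[24879.1] MPTS PCR wrapped ..." but not
  # "Failed to send file ...", where a space follows the first word.
  # Pattern 2 is tried only if pattern 1 fails. CISCO_REASON can span
  # several words, so it should pick up "Failed to send file to URL [http://...]".
  match => {
    "message" => [
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{WORD:key}%{NOTSPACE:trash}%{SPACE}%{GREEDYDATA:message}",
      "\A%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{WORD:LogLevel}%{SPACE}%{CISCO_REASON:key}%{NOTSPACE:URL}%{SPACE}%{GREEDYDATA:supplement}"
    ]
  }
}
```

With the order reversed, the CISCO_REASON pattern would win for every line and the WORD pattern would never be reached.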


(Rodolphe Redouté) #11

To 2: OK, I think I understand how it works now.

To 1: in fact there are entries (only the ones that matched my first grok), and to be sure I checked with "Last 6 months".
I have already made a filter change like this before, and back then I only had to restart Logstash and refresh the fields in the Kibana index pattern, but now that doesn't work.
The log just says that startup completed correctly.

Yeah, I figured that out too after you gave me the advice.


(Rodolphe Redouté) #12

It feels like either Filebeat is no longer pushing the log files to Logstash, or Logstash doesn't apply its filter before sending them to Elasticsearch, or Elasticsearch can't receive the files,
but I have absolutely no idea where the problem is.

I deleted the old files directly in Elasticsearch and tried to have them reloaded, but the files no longer appear (even after I restarted Filebeat, Logstash, Elasticsearch and Kibana).


(Christian Stockhaus) #13

You can rule out Filebeat if it writes its metrics to its log and they contain this entry with a number > 0: libbeat.logstash.published_and_acked_event

You could enable debug logging in Logstash to see whether it processes events (by setting log.level: trace in the config file).


(Rodolphe Redouté) #14

Oh well, the problem solved itself last night, but 1 thing is particularly surprising:
today's log file (with yesterday's errors) was uploaded to Logstash last night, but the previous files were not updated with the new filter.

Is there some kind of "flag" set when a log file has been processed by Logstash? And if that's the case, is there a way to configure it so that these files can be re-uploaded with a new filter?


(Christian Stockhaus) #15

Then it could have been a mapping problem in Elasticsearch, if you have daily indices.

If you use Filebeat for this, there is a registry file (the default is /var/lib/filebeat/registry) in which it saves every file and the position up to which it has read, after the ack from Logstash.

You could move the files that should be re-uploaded out of the original directory (so that they are no longer in the watched directory), wait a bit for Filebeat to register this (delete the entry from the registry file), and then move them back into this directory. If you haven't configured ignore_older, it should try to upload these files as if they were new files.


(Rodolphe Redouté) #16

Yes, the problem was the mapping. It works now!

Thank you!

