Parsing a varying length line with custom grok filter, only returns the first field

Hello,

I'm trying to set up a custom grok filter for my data input, but when I test it in Kibana's Grok Debugger, I only get the value for the first field (field1). I'm using grok instead of the csv parser because the last data field, after field7, varies in length, and it should just be a single entry in Logstash (I will do some post-processing on it afterwards).

My data looks like this:

91877900$|$11613428$|$DEVICE$|$CUSTOM-DEVICE1$|$UTC+02:00$|$["13","19","24","53","60","61","65","66","67","8","1"]$|$title=News$|$genre=News Broadcast$|$startTime=1574190000000$|$programId=659107083$|

My grok pattern looks like this:

%{INT:field1}\$|$ %{INT:field2}\$|$ %{WORD:field3}\$|$ %{DATA:field4}\$|$ %{DATA:field5}\$|$ %{TZ:field6}\$|$ %{GREEDYDATA:field7}\$|$ %{GREEDYDATA:theRestOfIt}

Can someone help with this? I'm getting stuck on this part, and I don't understand why my output looks like this:

{
  "field1": "91877900"
}

You need to escape all of the $ and | with \

| is used for alternation -- foo|bar matches either foo or bar, so your pattern matches any one of

%{INT:field1}\$
$ %{INT:field2}\$
$ %{WORD:field3}\$
etc.

So once it matches the first INT, it does not check the rest of the patterns.
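
To see the alternation effect outside of Logstash, here is a minimal sketch using Python's re module as a stand-in for grok's regex engine. The named groups and the \d+ and \w+ classes are my own rough approximations of INT and WORD, not the actual grok definitions, and the sample line is shortened to the first three fields.

```python
import re

# Shortened sample: only the first three fields of the original line.
line = '91877900$|$11613428$|$DEVICE'

# Unescaped "|" splits the regex into three alternatives, and each bare "$"
# is an end-of-line anchor rather than a literal dollar sign.
unescaped = r'(?P<field1>\d+)\$|$ (?P<field2>\d+)\$|$ (?P<field3>\w+)'

# Escaped "\$\|\$" matches the literal three-character delimiter $|$.
escaped = r'(?P<field1>\d+)\$\|\$(?P<field2>\d+)\$\|\$(?P<field3>\w+)'

first = re.search(unescaped, line).groupdict()
second = re.search(escaped, line).groupdict()

print(first)   # only field1 is filled, the other groups are None
print(second)  # all three fields are filled
```

With the unescaped pattern the first alternative succeeds at the start of the line, so the other two are never tried. That matches the single field1 in the Grok Debugger output.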

Thank you for the quick reply, Badger.

I tried escaping the $ and | with \, but if I run
%{INT:field1}\$ \|\$%{INT:field2}\$
or
%{INT:field1}\$ \$%{INT:field2}\$
on my input, I get a "Provided Grok patterns do not match data in the input" error.

I also get the same error if I try:
%{INT:field1}\$\|\$ %{INT:field2}\$\|\$ %{WORD:field3}\$\|\$ %{DATA:field4}\$\|\$ %{DATA:field5}\$\|\$ %{TZ:field6}\$\|\$ %{GREEDYDATA:field7}\$\|\$ %{GREEDYDATA:theRestOfIt}

Am I missing something?

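Remove all the spaces (your data has none around the delimiters) and replace TZ with DATA (the sixth field in your data is the array, not a time zone).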

input { generator { count => 1 lines => [ '91877900$|$11613428$|$DEVICE$|$CUSTOM-DEVICE1$|$UTC+02:00$|$["13","19","24","53","60","61","65","66","67","8","1"]$|$title=News$|$genre=News Broadcast$|$startTime=1574190000000$|$programId=659107083$|' ] } }
filter {
    grok { match => { "message" => "%{INT:field1}\$\|\$%{INT:field2}\$\|\$%{WORD:field3}\$\|\$%{DATA:field4}\$\|\$%{DATA:field5}\$\|\$%{DATA:field6}\$\|\$%{GREEDYDATA:field7}\$\|\$%{GREEDYDATA:theRestOfIt}" } }
}
output { stdout { codec => rubydebug { metadata => false } } }

produces

     "field6" => "[\"13\",\"19\",\"24\",\"53\",\"60\",\"61\",\"65\",\"66\",\"67\",\"8\",\"1\"]",
     "field1" => "91877900",
"theRestOfIt" => "programId=659107083$|",
     "field7" => "title=News$|$genre=News Broadcast$|$startTime=1574190000000",

etc.
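
If you want to check the field boundaries without running Logstash, the following is a rough sketch of the same pattern in Python. It assumes DATA behaves like a lazy .*? and GREEDYDATA like a greedy .*, with [+-]?\d+ and \w+ standing in for INT and WORD. The names SEP, PARTS, WORKING, SPACED and parse are made up for this illustration.

```python
import re

LINE = ('91877900$|$11613428$|$DEVICE$|$CUSTOM-DEVICE1$|$UTC+02:00$|$'
        '["13","19","24","53","60","61","65","66","67","8","1"]$|$'
        'title=News$|$genre=News Broadcast$|$startTime=1574190000000$|$'
        'programId=659107083$|')

SEP = r'\$\|\$'  # the literal delimiter $|$, fully escaped

# Assumed stand-ins for the grok patterns (not the real definitions).
PARTS = [
    r'(?P<field1>[+-]?\d+)',   # INT
    r'(?P<field2>[+-]?\d+)',   # INT
    r'(?P<field3>\w+)',        # WORD
    r'(?P<field4>.*?)',        # DATA (lazy)
    r'(?P<field5>.*?)',        # DATA (lazy)
    r'(?P<field6>.*?)',        # DATA (lazy)
    r'(?P<field7>.*)',         # GREEDYDATA (greedy)
    r'(?P<theRestOfIt>.*)',    # GREEDYDATA (greedy)
]

WORKING = re.compile(SEP.join(PARTS))         # no spaces after delimiters
SPACED = re.compile((SEP + ' ').join(PARTS))  # a space after each delimiter

def parse(line, pattern=WORKING):
    """Return the named fields as a dict, or None if the pattern fails."""
    m = pattern.search(line)
    return m.groupdict() if m else None

result = parse(LINE)
print(result['field7'])       # greedy: runs up to the last $|$
print(result['theRestOfIt'])
print(parse(LINE, SPACED))    # None, the data has no space after $|$
```

Because the stand-in for field7 is greedy, it swallows everything up to the last $|$ in the line, which is why theRestOfIt only holds the programId part. Swapping that stand-in for the lazy .*? would make field7 stop at the first delimiter instead.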

Thanks a lot for the help, that solved it!
