How can I use grok to match a single character of a word

Hi,
I have a string like SAMPLE DATA 08X71 +F(9)L.

I can match each word with %{WORD} like this:

%{WORD:1}\s*%{WORD:2}\s*%{WORD}

But how can I match each character of the last word separately in the same grok pattern? I want to create separate fields for +, F and L.

grok { match => { "message" => "%{WORD:1}\s*%{WORD:2}\s*%{WORD}\s(?<g1>.)(?<g2>.)\(%{WORD}\)(?<g3>.)" } }

Thank you for this. Would you be able to explain the \s(?<g1>.) part, please? I couldn't find \s in the grok default patterns, though I can see SPACE and NOTSPACE.

Take a look at the section of the grok filter documentation called "Custom Patterns". (?<g1>.) defines a named capture group that creates a field on the event called g1, which consists of a single character from the message (which is what . matches). The \s is ordinary regular-expression syntax for a whitespace character, which you were already using in your own pattern.
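If you would rather stay with the named grok patterns, here is a sketch of the same filter with the whitespace written as %{SPACE}, which in the stock pattern set expands to \s* (zero or more whitespace characters). The comments are mine and the field names are unchanged; for the sample message it should give the same g1, g2 and g3 fields.

```
filter {
  grok {
    # %{SPACE} is the stock grok pattern for \s* (zero or more whitespace characters).
    # (?<name>regex) is an inline named capture; "." matches exactly one character,
    # so g1, g2 and g3 each hold a single character of the last word.
    # \( and \) match the literal parentheses around the 9.
    match => { "message" => "%{WORD:1}%{SPACE}%{WORD:2}%{SPACE}%{WORD}%{SPACE}(?<g1>.)(?<g2>.)\(%{WORD}\)(?<g3>.)" }
  }
}
```

Because . accepts any character, you could tighten each capture to a character class such as [A-Z] if you know what to expect in that position.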

You might want to try running logstash on the command line with the following configuration:

input { generator { message => 'SAMPLE DATA 08X71 +F(9)L' count => 1 } }
output { stdout { codec => rubydebug } }
filter {
  grok { match => { "message" => "%{WORD:1}\s*%{WORD:2}\s*%{WORD}\s(?<g1>.)(?<g2>.)\(%{WORD}\)(?<g3>.)" } }
}

and you should see the following, plus the standard fields:

            "g3" => "L",
       "message" => "SAMPLE DATA 08X71 +F(9)L",
            "g2" => "F",
             "1" => "SAMPLE",
             "2" => "DATA",
            "g1" => "+"

I find this really useful for verifying patterns, although having to restart logstash for every tweak is painfully expensive.
