How can I use grok to match a single character of a word


(RegeX) #1

Hi,
I have a string like SAMPLE DATA 08X71 +F(9)L.

I can match each word with %{WORD}, like this:

%{WORD:1}\s*%{WORD:2}\s*%{WORD}

But how can I match each character of the last word separately in the same grok pattern? I want to create separate fields for +, F and L.


#2
grok { match => { "message" => "%{WORD:1}\s*%{WORD:2}\s*%{WORD}\s(?<g1>.)(?<g2>.)\(%{WORD}\)(?<g3>.)" } }

(RegeX) #3

Thank you for this. Would you be able to explain the \s(?<g1>.) part, please? I couldn't find \s in the grok default patterns, though I can see SPACE and NOTSPACE.


#4

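Take a look at the section of the grok filter documentation called "Custom Patterns". (?<g1>.) defines a named capture group that creates a field on the event called g1, which consists of a single character from the message (which is what . matches). The \s means whitespace, which you were already using in \s*. It is ordinary regular-expression syntax rather than a grok pattern, which is why it does not appear in the list of default patterns.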
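To see the named capture groups in isolation, here is a minimal sketch of the same idea in plain Python rather than Logstash. It rests on a few assumptions: grok's WORD is approximated as \b\w+\b, Python spells named groups (?P<name>...) where grok uses (?<name>...), and the first two fields are renamed f1 and f2 (hypothetical names) because Python group names cannot start with a digit.

```python
import re

# Rough stand-in for grok's WORD pattern (assumed to be \b\w+\b).
WORD = r"\b\w+\b"

# Same shape as the grok expression: two named words, an unnamed word,
# one whitespace character, then single-character groups around "(...)".
pattern = re.compile(
    rf"(?P<f1>{WORD})\s*(?P<f2>{WORD})\s*{WORD}\s"
    rf"(?P<g1>.)(?P<g2>.)\({WORD}\)(?P<g3>.)"
)

match = pattern.search("SAMPLE DATA 08X71 +F(9)L")

# groupdict() returns only the named groups, much as grok only creates
# fields for patterns that were given a name.
fields = match.groupdict() if match else {}
print(fields)
```

Each . consumes exactly one character, so g1, g2 and g3 each hold a single character, and the unnamed words (08X71 and 9) are matched but not kept.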

You might want to try running Logstash on the command line with this configuration:

input { generator { message => 'SAMPLE DATA 08X71 +F(9)L' count => 1 } }
output { stdout { codec => rubydebug } }
filter {
  grok { match => { "message" => "%{WORD:1}\s*%{WORD:2}\s*%{WORD}\s(?<g1>.)(?<g2>.)\(%{WORD}\)(?<g3>.)" } }
}

and you should see the following, plus the standard fields:

            "g3" => "L",
       "message" => "SAMPLE DATA 08X71 +F(9)L",
            "g2" => "F",
             "1" => "SAMPLE",
             "2" => "DATA",
            "g1" => "+"

I find this really useful for verifying patterns, although having to restart Logstash for every tweak is painfully expensive.

