I want to use a regex to extract two more fields from request (which is itself extracted from the message field by a grok match).
For example, given request: /im/56/d9/17/z25006678F,Barka-hotelowa-na-Wisle.jpg
I want:
id: 25006678
Format: F
I tried something like this:
grok {
match => { "message" => '%{IP:ip}\s(%{GREEDYDATA:logname})?\s(%{GREEDYDATA:remote_user})?\s\[%{HTTPDATE:timestamp}\]\s\"%{WORD:metoda}\s%{NOTSPACE:request}\s(HTTP/%{NUMBER:httpversion}\")?\s%{NUMBER:response}\s%{GREEDYDATA:bytes}\s(\"%{GREEDYDATA:referer}\")?\s(\"%{GREEDYDATA:user_setup}\")?\s%{GREEDYDATA:canonical}' }
remove_field => ["message"]
}
if [request] == "nagios.plc" {
drop {}
}
grok {
match => {"request" => "XX: (?<request>([0-9]*))" }
match => {"request" => "Format: (?<request>(\D+).jpg") }
Then I tried moving it ahead of the point where the first grok removes message, and matching against message instead of request, in case Logstash does not know the request field at that point. The logs still said something was wrong with those matches.
Should it be one match instead of three?
Should I use a different Logstash plugin?
You have a field that contains /im/56/d9/17/z25006678F,Barka-hotelowa-na-Wisle.jpg
and you are wondering why neither "XX: (?<request>([0-9]*))" nor "Format: (?<request>(\D+).jpg" matches it. Is that your question?
No, they are not. In the window on the right it says "Your regular expression does not match the subject string." This is why grok does not create any fields: the pattern does not match. Note the "XX: matches the characters XX: literally (case sensitive)" which is not coloured. The characters "XX: " do not occur in your string so they do not match.
If you remove the "Format: " your second pattern will match "F,Barka-hotelowa-na-Wisle.jpg". I think you want
grok { match => { "request" => "z(?<r>[0-9]+)" } }
grok { match => { "request" => "(?<s>\D),\D+\.jpg" } }
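On the question of whether this should be one match: a minimal sketch of a single grok that captures both values at once. The field names id and format are assumptions taken from the question, and the pattern assumes the format is one uppercase letter directly followed by a comma, as in the sample request.

```
filter {
  grok {
    # "z" followed by digits (the id), then one uppercase letter (the format), then a comma.
    # Field names "id" and "format" are illustrative, not taken from an existing config.
    match => { "request" => "z(?<id>[0-9]+)(?<format>[A-Z])," }
  }
}
```

For the sample request this should produce id = 25006678 and format = F. If other requests use a different layout, the pattern would need adjusting.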
If you just remove the XX: then your (?<request>([0-9]*)) matches your string 43 different times, and grok would just return the first one. * means zero or more, and in "/im/56/d9/17/z25006678F,Barka-hotelowa-na-Wisle.jpg" there are zero digits before the leading /, so it matches there (and grok will match a nil string there), then there are zero digits before the i, so it matches there, zero before the m, so it matches there, and zero before the next / so it matches there again. Then it matches 56, and so on. If you change it to (?<request>([0-9]+)) then it will match 56. You might be able to use "(?<r>[0-9]{3,})", which means at least 3 digits.
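As a sketch of the "at least 3 digits" variant: the field name id and the tag name _id_not_found are hypothetical, chosen only for illustration. This relies on the directory parts of the path never containing a run of three or more digits, which holds for the sample request but is an assumption about the rest of the data.

```
filter {
  grok {
    # The first run of 3 or more digits; in the sample this skips 56, 9 and 17.
    match => { "request" => "(?<id>[0-9]{3,})" }
    # Hypothetical tag name, so events without such a run are easy to find later.
    tag_on_failure => ["_id_not_found"]
  }
}
```

Anchoring on the leading z, as in the earlier patterns, is less fragile when the path layout is known.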