Disappearing logs when using regex in grok match

harrytewkesbury · August 19, 2016, 1:19pm

Hi - I have a strange issue that I can't make heads or tails of. When I parse a log type for %{HTTPDATE} it works fine, but when I also add %{IPORHOST}, the entire log never makes it to Elasticsearch. I've looked for it by tag, by text search, etc. Here are the stanzas:

if "apache" in [tags] and "external" in [tags] and "legacy" in [tags] {
  grok {
    match => [ "message", "%{IPORHOST:src_ip}.+*%{HTTPDATE:timestamp}" ]
    overwrite => [ "timestamp", "message" ]
    tag_on_failure => [ "grokfail_legacy" ]
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    add_tag => "dateparsesuccess_legacy"
  }
}

Filebeat is tagging correctly (I can see the fields when grepping redis-cli) and when I remove that IPORHOST pattern, and have it just:

match => [ "message", "%{HTTPDATE:timestamp}" ]

It works OK. Trouble is, I want that IP address! Is this a bug, or am I doing something dumb? Thanks.

harrytewkesbury · August 19, 2016, 1:42pm

And of course, inevitably, I discover that the regex actually chops the 1 off the 19 in the date, so the timestamp comes out wrong. I could swear I couldn't find the relevant tags though.

Any idea how I could match the IP and the timestamp of a log like this without screwing up?

95.87.154.242, 141.101.92.152 [-] - - [19/Aug/2016:13:13:02 +0000] "GET /blog/feed/ HTTP/1.1" 301 519 "-" "UniversalFeedParser/5.2.1 +https://code.google.com/p/feedparser/"
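
For anyone reproducing this outside Logstash, here is a minimal Python sketch of why the leading digit goes missing. The IP, MONTHDAY and HTTPDATE expressions below are simplified stand-ins I wrote for illustration, not the real grok definitions, and a plain greedy `.*` stands in for the `.+*` in the stanza (Python's `re` rejects that stacked quantifier). The assumption is that the day-of-month pattern also accepts a single digit, so the greedy wildcard only has to give back enough characters for `9/Aug/...` to match.

```python
import re

# Sample line from the post above.
LINE = (
    '95.87.154.242, 141.101.92.152 [-] - - [19/Aug/2016:13:13:02 +0000] '
    '"GET /blog/feed/ HTTP/1.1" 301 519 "-" '
    '"UniversalFeedParser/5.2.1 +https://code.google.com/p/feedparser/"'
)

# Simplified stand-ins (assumptions, not the actual grok patterns).
IP = r"\d{1,3}(?:\.\d{1,3}){3}"
MONTHDAY = r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])"  # note: a lone digit is allowed
HTTPDATE = MONTHDAY + r"/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}"

# Greedy wildcard directly in front of the date, as in the failing stanza.
greedy = re.compile(rf"(?P<src_ip>{IP}).*(?P<timestamp>{HTTPDATE})")

m = greedy.search(LINE)
greedy_ip = m.group("src_ip")
greedy_ts = m.group("timestamp")

print(greedy_ip)  # 95.87.154.242
print(greedy_ts)  # 9/Aug/2016:13:13:02 +0000  (the "1" was eaten by ".*")
```

The wildcard backtracks from the end of the line and stops at the first position where the date pattern fits, which is one character too late.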

magnusbaeck · August 22, 2016, 11:00am

Which IP address(es) do you want to capture? If you want to capture more than one, where should they be stored? All in one field (i.e. making it an array field)? Something else?

harrytewkesbury · August 22, 2016, 11:38am

I managed to work out the capture, thanks. It was just a stupid typo in my regex. The second IP is Cloudflare, so there's no need to keep it. The correct (or at least working) solution for me is simply:

match => [ "message", "%{IPORHOST:clientip}+.*\[%{HTTPDATE:timestamp}" ]

This claims the first IP, skips everything else up until the HTTPDATE and claims that as "timestamp". All good. Thanks.
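
A rough way to check that behaviour outside Logstash is the Python sketch below. It assumes simplified stand-in expressions for IPORHOST and HTTPDATE (my own approximations, not the real grok definitions) and leaves out the `+` after the IP capture, which is my simplification. The point it illustrates is that the literal `\[` pins the start of the date, so the greedy `.*` can no longer swallow its first digit.

```python
import re

# Sample line from earlier in the thread.
LINE = (
    '95.87.154.242, 141.101.92.152 [-] - - [19/Aug/2016:13:13:02 +0000] '
    '"GET /blog/feed/ HTTP/1.1" 301 519 "-" '
    '"UniversalFeedParser/5.2.1 +https://code.google.com/p/feedparser/"'
)

# Simplified stand-ins (assumptions, not the actual grok patterns).
IP = r"\d{1,3}(?:\.\d{1,3}){3}"
MONTHDAY = r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])"
HTTPDATE = MONTHDAY + r"/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}"

# The literal "[" must sit immediately before the date, so the wildcard
# has to stop exactly there. The earlier "[-]" is skipped because "-"
# does not start a date.
anchored = re.compile(rf"(?P<clientip>{IP}).*\[(?P<timestamp>{HTTPDATE})")

m = anchored.search(LINE)
client_ip = m.group("clientip")
timestamp = m.group("timestamp")

print(client_ip)  # 95.87.154.242
print(timestamp)  # 19/Aug/2016:13:13:02 +0000
```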
