Logstash 1.5.x grok regex multiline behavior change?


(Bitsof Info) #1

Hi,

Currently upgrading from 1.4.1 to 1.5.4

This filter works fine in 1.4.x : https://github.com/bitsofinfo/logstash-modsecurity/blob/master/logstash-modsecurity.conf

When I run it in 1.5.4 it "sort of" works. There seem to be issues around how ".+" matches newlines in logstash 1.5.x.

Before I go too far off into the weeds here, I am curious: what changed in the regex/grok engine between 1.4.1 and 1.5.4 that would change the default behavior of ".+" in grok patterns when it comes to matching newlines?

Here is a basic example:

RAW DATA:

GET /etc/passwd HTTP/1.1
User-Agent: Wget/1.15 (darwin13.1.0)
Accept: */*
Host: my.host
Connection: Keep-Alive

GROK:

grok {
    match => {
             "message" => "%{DATA:httpMethod}\s(?<requestedUri>\S+)\s(?<incomingProtocol>.+)\n{1}"
    }
}

EXPECTED (3 fields) (and this is what I get with logstash 1.4.1)

httpMethod = GET
requestedUri = /etc/passwd
incomingProtocol = HTTP/1.1

1.5.4 RESULT (incomingProtocol contains more data than it should)

httpMethod = GET
requestedUri = /etc/passwd

incomingProtocol
HTTP/1.1
User-Agent: Wget/1.15 (darwin13.1.0)
Accept: */*
Host: my.host
Connection: Keep-Alive


(Bitsof Info) #2

Note: it appears that Logstash 1.5.4 grok matches newlines when I use ".+", but if I use ".+?" I only get the first line. Shouldn't newline matching be off by default?
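
To make the symptom easier to reason about, here is a minimal sketch in Python that reproduces the same three outcomes. It is only an illustration of general regex semantics: Python's `re` module is not the engine grok uses, `re.DOTALL` stands in for whatever "dot matches newline" mode may be active, and whether 1.5.x actually enables such a mode is an assumption, not something confirmed in this thread. The trailing newline on the sample data and the helper name `incoming_protocol` are also assumptions made for the example.

```python
import re

# Sample event from the post above; the trailing newline is an assumption.
RAW = (
    "GET /etc/passwd HTTP/1.1\n"
    "User-Agent: Wget/1.15 (darwin13.1.0)\n"
    "Accept: */*\n"
    "Host: my.host\n"
    "Connection: Keep-Alive\n"
)

# %{DATA:httpMethod} is written out as a lazy named group, since the stock
# grok DATA pattern is ".*?".
PREFIX = r"(?P<httpMethod>.*?)\s(?P<requestedUri>\S+)\s"

# Three variants of the last capture group.
GREEDY = r"(?P<incomingProtocol>.+)"
LAZY = r"(?P<incomingProtocol>.+?)"
NO_NEWLINE = r"(?P<incomingProtocol>[^\n]+)"


def incoming_protocol(tail, flags=0):
    """Hypothetical helper: return the incomingProtocol capture, or None."""
    match = re.search(PREFIX + tail + r"\n{1}", RAW, flags)
    return match.group("incomingProtocol") if match else None


# Dot does NOT match newline: greedy ".+" stops at the end of the first line.
print(repr(incoming_protocol(GREEDY)))

# Dot DOES match newline: greedy ".+" runs to the last newline in the event.
print(repr(incoming_protocol(GREEDY, re.DOTALL)))

# Lazy ".+?" stops at the first newline even when dot matches newline.
print(repr(incoming_protocol(LAZY, re.DOTALL)))

# "[^\n]+" never crosses a newline, whichever mode is active.
print(repr(incoming_protocol(NO_NEWLINE, re.DOTALL)))
```

If the behavior in 1.5.4 really is a mode difference of this kind, replacing `.+` with `[^\n]+` (or `.+?`) in the grok pattern should give the 1.4.1 result in both versions, because neither form depends on how the dot treats newlines.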


(Bitsof Info) #3

Anyone?


(Bitsof Info) #4

anyone?

