My source is Filebeat with correctly configured multi-line support. Using Logstash (v5.6.16), I am trying to extract the data between two marker strings, the second of which is optional. This all works fine unless the source data is split over multiple lines.
Initial grok pattern (single line source): "Marker1 (?<data>.*?)Marker2"
"message" : "Marker1 This is the first part and this is the second part Marker2"
"data": " This is the first part and this is the second part "
However, if message includes a line break, I get no match:
"message" : "Marker1 This is the first part
and this is the second part Marker2"
To address this, I changed the pattern to "Marker1 (?<data>(.|\n)*?)Marker2", which works correctly with both the single and two-line messages:
"data": " This is the first part and this is the second part"
Now I need to make Marker2 optional, so I changed the pattern to:
"Marker1 (?<data>(.|\n)*?)(Marker2|$)", which also works correctly for the single-line message, but not for the multi-line message, which gives:
"message" : "Marker1 This is the first part
and this is the second part"
"data" : "This is the first part "
The problem is that the $ is matching the end of the first line and not the end of the string.
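To show the behaviour outside Logstash, here is a minimal Python sketch. It rests on an assumption: Python's re module is only a stand-in for grok's regex engine, and re.MULTILINE is used to mimic a $ that matches at the end of every line. The extract helper and the sample message are made up for illustration.

```python
import re

# Assumption: re.MULTILINE mimics the "$ matches at each line end" behaviour seen in grok.
PATTERN_LINE_END = re.compile(
    r"Marker1 (?P<data>(.|\n)*?)(Marker2|$)", re.MULTILINE
)

# Without re.MULTILINE, Python's $ matches only at the end of the string
# (or just before a trailing newline), which is the behaviour I am after.
PATTERN_STRING_END = re.compile(r"Marker1 (?P<data>(.|\n)*?)(Marker2|$)")


def extract(pattern, message):
    """Return the captured 'data' group, or None if the pattern does not match."""
    match = pattern.search(message)
    return match.group("data") if match else None


# Hypothetical two-line message without the optional Marker2.
message = "Marker1 This is the first part\nand this is the second part"

print(repr(extract(PATTERN_LINE_END, message)))    # capture stops at the first line break
print(repr(extract(PATTERN_STRING_END, message)))  # capture runs to the end of the string
```

The first pattern reproduces the truncated "data" value; the second shows the result I would like grok to produce, where the anchor is tied to the end of the whole message.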
If I replicate the problem in regex101.com, I can add the "(?s)" modifier string at the beginning of the pattern, which makes $ only detect the end of the string, not end of line.
However, I can't find a similar option for grok - it supports "(?m)" but gives a compile error for "(?s)".
Can anyone help me solve this problem, please?
Thanks,
Andrew