Grok filter with spaces in field


I'm trying to work out how to build a grok filter with spaces in one of the fields but I'm not sure how to do it. The log looks like this:

2017-06-28 14:13:05 Coordinated Universal Time INFO [1234:0x0000019f] Loaded log manager '/path/to/some/interesting/things' for Platform

If I put quotes around "Coordinated Universal Time", I can use this filter:

%{TIMESTAMP_ISO8601:run_start} %{QUOTEDSTRING:timezone} %{WORD:log_level} %{DATA:id} %{GREEDYDATA:message_detail}

However, I can't change the format of the original log file to put quotes around that field. Can anybody advise how I can do this with a grok filter, please?

Thanks a lot!

The easiest way is to look for unique "breakpoints" in the log line and build your pattern around them. For instance, those square brackets let you use GREEDYDATA to capture arbitrary strings while still keeping regex greed in check.
Try this one:
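
As a rough sketch of the breakpoint idea (an assumption based on the follow-up replies, not necessarily the exact pattern posted), a grok expression along the lines of `%{TIMESTAMP_ISO8601:run_start} %{GREEDYDATA:timezone} %{WORD:log_level} \[%{DATA:id}\] %{GREEDYDATA:message_detail}` can be approximated in plain regex and checked in Python. The timestamp sub-pattern here is a simplified stand-in for TIMESTAMP_ISO8601, and the field names are the ones from the question.

```python
import re

# Sample line from the question.
LINE = (
    "2017-06-28 14:13:05 Coordinated Universal Time INFO "
    "[1234:0x0000019f] Loaded log manager "
    "'/path/to/some/interesting/things' for Platform"
)

# Plain-regex approximation of the grok pattern (assumed, see above).
PATTERN = re.compile(
    r"(?P<run_start>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "  # simplified timestamp
    r"(?P<timezone>.*) "          # GREEDYDATA: greedy, may contain spaces
    r"(?P<log_level>\w+) "        # WORD
    r"\[(?P<id>.*?)\] "           # literal brackets act as the breakpoint
    r"(?P<message_detail>.*)"     # GREEDYDATA: rest of the line
)

fields = PATTERN.match(LINE).groupdict()
for name, value in fields.items():
    print(f"{name}: {value}")
```

The literal `\[` is what anchors the match: the greedy timezone capture has to stop early enough to leave a single word and a space in front of the opening bracket.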

Hi Paz,

That works, thanks!

But I don't understand why it works! I could understand it if there were square brackets around the log_level (INFO), because that would break the GREEDYDATA capture that comes before it. As it stands, I'd expect this expression to pick up timezone and log_level as one field.

I suppose I'm asking, why doesn't GREEDYDATA pick up the timezone as "Coordinated Universal Time INFO"? How does it know that they are two separate fields?


Because greediness relies heavily on backtracking. A greedy group first captures as many characters as possible, then gives characters back one at a time until the rest of the pattern can match.

Here's a step-by-step walkthrough to help visualize it (I converted the grok patterns to pure regex).
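
The backtracking can be imitated by hand in Python. This is only a simplified model, assuming the part of the pattern after the timezone group is a space, a word, a space and a literal `[`. A real regex engine does more bookkeeping, but the order in which candidates are tried is the same idea. The function name `backtrack_steps` is made up for this illustration.

```python
import re

# The part of the log line that follows the timestamp.
TEXT = "Coordinated Universal Time INFO [1234:0x0000019f] Loaded"

# What must come right after the timezone capture: " WORD ["
REST = re.compile(r" \w+ \[")

def backtrack_steps(text):
    """Yield (candidate, ok) pairs, longest candidate first."""
    for end in range(len(text), -1, -1):
        candidate = text[:end]
        # Does the remainder of the pattern match right after the candidate?
        ok = REST.match(text, end) is not None
        yield candidate, ok
        if ok:
            return  # the engine stops at the first candidate that works

steps = list(backtrack_steps(TEXT))
for candidate, ok in steps:
    print(("MATCH " if ok else "fail  ") + repr(candidate))
```

Every candidate that still includes "INFO" fails, because nothing after it looks like a word followed by `[`. The first candidate that succeeds is the one ending just before " INFO [", which is why the two fields come out separately.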

As you can imagine, it is an expensive process, and depending on where the substring sits in the log line, it could make sense to use lazy matching instead of greedy (DATA instead of GREEDYDATA).
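
To get a feel for the difference, the same simplified model can count how many candidate lengths are tried in each mode. This is a sketch under the same assumptions as before (the remainder of the pattern is " WORD [", and `attempts` is a hypothetical helper), not a measurement of what the grok engine really does.

```python
import re

# The part of the sample log line that follows the timestamp.
TEXT = (
    "Coordinated Universal Time INFO [1234:0x0000019f] Loaded log manager "
    "'/path/to/some/interesting/things' for Platform"
)
REST = re.compile(r" \w+ \[")

def attempts(text, greedy):
    """Return (capture, number of candidate lengths tried)."""
    if greedy:
        ends = range(len(text), -1, -1)   # longest first, shrink (GREEDYDATA)
    else:
        ends = range(0, len(text) + 1)    # shortest first, grow (DATA)
    for tried, end in enumerate(ends, start=1):
        if REST.match(text, end):
            return text[:end], tried
    return None, len(text) + 1

greedy_capture, greedy_tries = attempts(TEXT, greedy=True)
lazy_capture, lazy_tries = attempts(TEXT, greedy=False)
print("greedy:", repr(greedy_capture), greedy_tries, "tries")
print("lazy:  ", repr(lazy_capture), lazy_tries, "tries")
```

Both modes end up with the same capture on this line, but since the timezone sits near the start, the lazy version gets there in fewer tries. For a field near the end of a long line the balance would tip the other way.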

Thanks a lot for explaining.
