Matching multiple patterns in grok for a filebeat ingestion pipeline

Logstash's grok filter has a break_on_match option that, when set to false, lets grok apply every pattern in the list instead of stopping at the first match. Am I correct in believing that no such option exists for the grok processor in Filebeat module ingest pipelines (e.g., .../filebeat/module/my_module/my_fileset/ingest/pipeline.json)?

How can I accomplish that?

The logs I'm grokking are pretty long. Here's an example:

2019-12-28T19:14:32.848+0000: 238687.843: [GC pause (G1 Evacuation Pause) (young), 0.0572720 secs]
   [Parallel Time: 34.1 ms, GC Workers: 13]
      [GC Worker Start (ms): Min: 238687843.5, Avg: 238687843.7, Max: 238687843.8, Diff: 0.3]
      [Ext Root Scanning (ms): Min: 0.3, Avg: 1.1, Max: 7.9, Diff: 7.6, Sum: 14.7]
      [Update RS (ms): Min: 0.0, Avg: 0.6, Max: 0.8, Diff: 0.8, Sum: 7.4]
         [Processed Buffers: Min: 0, Avg: 3.5, Max: 12, Diff: 12, Sum: 45]
      [Scan RS (ms): Min: 0.0, Avg: 0.4, Max: 0.5, Diff: 0.5, Sum: 4.6]
      [Code Root Scanning (ms): Min: 0.0, Avg: 1.9, Max: 5.1, Diff: 5.1, Sum: 25.1]
      [Object Copy (ms): Min: 25.7, Avg: 29.1, Max: 31.9, Diff: 6.2, Sum: 378.5]
      [Termination (ms): Min: 0.0, Avg: 0.5, Max: 0.6, Diff: 0.5, Sum: 6.2]
         [Termination Attempts: Min: 1, Avg: 48.8, Max: 62, Diff: 61, Sum: 634]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 1.1]
      [GC Worker Total (ms): Min: 33.5, Avg: 33.7, Max: 33.8, Diff: 0.2, Sum: 437.6]
      [GC Worker End (ms): Min: 238687877.3, Avg: 238687877.3, Max: 238687877.4, Diff: 0.2]
   [Code Root Fixup: 0.4 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 1.2 ms]
   [Other: 21.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 20.0 ms]
      [Ref Enq: 0.2 ms]
      [Redirty Cards: 0.4 ms]
      [Humongous Register: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.8 ms]
   [Eden: 14496.0M(14496.0M)->0.0B(14496.0M) Survivors: 224.0M->224.0M Heap: 14745.9M(24576.0M)->240.1M(24576.0M)]
 [Times: user=0.49 sys=0.00, real=0.06 secs]

Putting all my grok patterns into a single pattern would result in the pattern being over 5,000 characters long. At the moment, I'm trying multiple grok processors, one for each pattern. We'll see how that goes.

Thanks!

Hi @Jim_Ivey, I believe the grok processor in a Filebeat ingest pipeline behaves like break_on_match by default (there is no specific config parameter for it). The grok processor documentation also says that the patterns list "Returns on the first expression in the list that matches": https://www.elastic.co/guide/en/elasticsearch/reference/master/grok-processor.html#using-grok
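
To illustrate the first-match behavior, here is a minimal sketch of a pipeline with a single grok processor that lists two patterns. The field names (gc.parallel_time_ms, gc.code_root_fixup_ms) are made up for this example, and it assumes the whole GC event arrives as one multiline message. Both patterns could match the sample log, but since grok returns on the first match, I would expect only gc.parallel_time_ms to be set:

```json
{
  "description": "Sketch only (hypothetical field names): grok stops at the first pattern that matches",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "\\[Parallel Time: %{NUMBER:gc.parallel_time_ms:float} ms",
          "\\[Code Root Fixup: %{NUMBER:gc.code_root_fixup_ms:float} ms\\]"
        ]
      }
    }
  ]
}
```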

Thanks for the quick response. I wanted to turn break_on_match off (set to false) to match multiple patterns in a single grok processor. That would allow me to break the 5,000-character pattern into multiple, shorter patterns.

What I've done (that works) is use a separate grok processor for each sub-pattern. I wasn't sure that I could have a dozen or so grok processors, but it worked.
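
For anyone who finds this later, the approach looks roughly like the sketch below. It is not my actual pipeline: the field names are hypothetical, only three of the sub-patterns are shown, and it assumes Filebeat's multiline settings have already joined the whole GC event into one message field. Each grok processor extracts one piece of the event, and because grok patterns are not anchored unless you add ^ or $, each sub-pattern can match anywhere in the message:

```json
{
  "description": "Sketch only (hypothetical field names): one grok processor per sub-pattern of a G1 GC event",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "^%{TIMESTAMP_ISO8601:gc.timestamp}: %{NUMBER:gc.uptime_secs:float}: \\[GC pause \\(%{DATA:gc.cause}\\) \\(%{WORD:gc.generation}\\), %{NUMBER:gc.pause_secs:float} secs\\]"
        ]
      }
    },
    {
      "grok": {
        "field": "message",
        "patterns": [
          "\\[Parallel Time: %{NUMBER:gc.parallel_time_ms:float} ms, GC Workers: %{NUMBER:gc.workers:int}\\]"
        ],
        "ignore_failure": true
      }
    },
    {
      "grok": {
        "field": "message",
        "patterns": [
          "\\[Times: user=%{NUMBER:gc.cpu.user_secs:float} sys=%{NUMBER:gc.cpu.sys_secs:float}, real=%{NUMBER:gc.cpu.real_secs:float} secs\\]"
        ],
        "ignore_failure": true
      }
    }
  ]
}
```

Setting ignore_failure on the later processors is a design choice in this sketch: without it, a sub-pattern that does not match would fail the whole pipeline for that event, whereas with it the remaining processors still run.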

Thanks again for your help.
