Filebeat filestream input parsers multiline fails

Hello

This is filebeat 7.15.0.

I have configured several Filebeat log inputs with multiline patterns, and they work. Next I changed the input type to filestream, following the documentation. However, when starting the beat it fails with this error message:

2021-11-30T10:44:40.888+0100 ERROR instance/beat.go:989 Exiting: Failed to start crawler: creating input reloader failed: error while parsing multiline parser config: unknown matcher type: accessing '0.parsers.0.multiline' accessing '0' (source:'/etc/filebeat/inputs.d/logs.dockerhost.yml')
The relevant configuration is this:

parsers:
  - multiline:
    type: pattern
    # Collects multi-line exceptions - lines that do not match the pattern which comes after a line that matches the pattern
    pattern: '^([0-9]{4}-[0-9]{2}-[0-9]{2})T[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\s\b(ERROR|SEVERE)'
    negate: false
    match: before
    timeout: 10s
    max_lines: 50

I have reviewed the configuration several times and reread the documentation, but the result is the same. The filestream input configured with a parsers multiline of type pattern does not work, despite what the Filebeat documentation says.

I have also looked in the filebeat.yml reference. It does not document the parsers type pattern.

Next, I kept the input type filestream, but configured the multiline patterns with the dotted multiline.* options. There are several filestream inputs with different multiline patterns in the file under the inputs.d directory.

This configuration seems to work:

multiline.type: pattern
# Collects multi-line exceptions - lines that do not match the pattern which comes after a line that matches the pattern
multiline.pattern: '^([0-9]{4}-[0-9]{2}-[0-9]{2})T[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\s\b(ERROR|SEVERE)'
multiline.negate: false
multiline.match: before
multiline.timeout: 10s
multiline.max_lines: 50

Perhaps there is a bug in either the filestream parsers option or the documentation?

Best regards
Flemming

Hi @fgjensen !

I think that the filestream input could have slightly different configuration options than the log input. Could you elaborate on what you changed in your configuration and how it is missing from the documentation? Then we can fix it in the docs if needed :).

C.

Hi @ChrsMark

The Filebeat version 7.15 filestream input documentation gives this configuration example for parsers, including multiline:

filebeat.inputs:
- type: filestream
  ...
  parsers:
    - ndjson:
      keys_under_root: true
      message_key: msg
    - multiline:
      type: counter
      lines_count: 3

Following the documentation for the multiline pattern, I have rewritten this to:

filebeat.inputs:
- type: filestream
  ...
  parsers:
    - multiline:
      keys_under_root: true
      type: pattern
      pattern: '^([0-9]{4}-[0-9]{2}-[0-9]{2})T[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\s\b(ERROR|SEVERE)'
      negate: false
      match: before
      timeout: 10s
      max_lines: 50

With this configuration Filebeat throws the error shown in my first post and then stops.
When I change the configuration to the one shown in my second post, Filebeat neither throws an error nor stops.

From this I conclude that the parsers option with a multiline pattern does not work as stated in the documentation. Perhaps I am wrong, but then I would like to learn how to configure the parsers option with multiline.
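
For readers comparing the two snippets above: the rewrite keeps `type`, `pattern` and the other options at the same indentation as `multiline:`, and it also carries over `keys_under_root`, which the documentation example lists under `ndjson` rather than `multiline`. Below is a sketch of the same settings in the nested layout that later replies in this thread report as working. It is untested here, and the elided input options stay elided as in the post.

```yaml
filebeat.inputs:
- type: filestream
  # ... other input options elided, as in the post above
  parsers:
    - multiline:
        # Options are nested one level below "multiline:".
        type: pattern
        pattern: '^([0-9]{4}-[0-9]{2}-[0-9]{2})T[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\s\b(ERROR|SEVERE)'
        negate: false
        match: before
        timeout: 10s
        max_lines: 50
```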

BR
Flemming

Thanks for clarifying @fgjensen ! Could you open a GitHub issue stating that the suggested configuration from the docs throws an error?

C.

Hi everyone,
I have the same problem and it is driving me crazy.
Let's imagine I have the logs below:

18-12-2021 10:15:16.446 somelogs
somelogs
18-12-2021 10:15:20.446 somelogs
somelogs

I am trying to send these logs from Filebeat directly to Elasticsearch, so this is my config:

filebeat.inputs:
- type: filestream    
  enabled: true
  paths:
    - /root/21.log
  parsers:
    - multiline:
      type: pattern
      pattern: ^([\d-]{8,10} [\d:.]{8,12})

There is also an error in filebeat log which says:

Exiting: Failed to start crawler: starting input failed: Error while initializing input: error while parsing multiline parser config: unknown matcher type:  accessing 'filebeat.inputs.0.parsers.0.multiline' accessing 'filebeat.inputs.0' (source:'/etc/filebeat/filebeat.yml'

This is a me-too on this issue... while attempting to update from a (quite elderly) filebeat config which contained a multiline pattern match on one file:

- type: filestream

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /home/glassfish/glassfish/domains/domain1/logs/server.log

  parsers:
    - multiline:
      type: pattern
      pattern: '^\['
      negate: true
      match: after

it throws the error:

Exiting: Failed to start crawler: starting input failed: Error while initializing input: error while parsing multiline parser config: unknown matcher type:  accessing 'filebeat.inputs.1.parsers.0.multiline' accessing 'filebeat.inputs.1' (source:'filebeat.yml')

As with other people, that's my interpretation of the manual for filebeat multiline handling. Which doesn't work as documented. The only difference? I'm running filebeat version 7.16.2

As with fgjensen, after changing the layout to the one given in "Manage multiline messages | Filebeat Reference [7.16] | Elastic", filebeat starts up... now it is just a question of checking whether the layout does what is expected! Configuration that does allow startup:

- type: filestream

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /home/glassfish/glassfish/domains/domain1/logs/server.log

  multiline.type: pattern
  multiline.pattern: '^\['
  multiline.negate: true
  multiline.match: after


So the manual page "filestream input | Filebeat Reference [7.16] | Elastic" is inconsistent with current behaviour.

Hi @mattml

I have had Filebeat 7.15.0 running on several servers with the filestream input type and the multiline pattern configured as both you and I describe. It works well with that configuration.

The documentation is incomplete.

Best regards
Flemming

Hi @fgjensen
I've been struggling with this for days. I had the configuration as type: log on Filebeat 7.15, and I started over with version 7.16.2 and type: filestream.
The documentation is definitely incomplete. It's an indentation issue.
The config that worked for me just two minutes ago is:

filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /var/log/*.log
  parsers:
    - multiline:
        type: pattern
        pattern: '^\['
        negate: true
        match: after

Hope this could help someone else.
Cheers

Thank you for the confirmation that the configuration works as per your example… it saved much heartache in rewriting the multiline clause. Now I’ve just got the ‘fun’ of wading through Puppet’s even less usable documentation to make Puppet deploy the Filebeat 7 compatible configuration file. Bad documentation seems to be quite common!

Matthew

Hi @Ruben_Bracamonte and @mattml

I realized that the multiline.* configuration from my second post was accepted by Filebeat, or at least Filebeat did not stop. However, it just skipped all the multiline lines in an exception.

So I changed the configuration as @Ruben_Bracamonte suggests, and that kind of works. However, some of the processors that follow are not working. If I comment out the multiline parser, all processors work.

This is quite annoying. Is there a required order, so that the processors must be configured before the multiline parser?
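
On the ordering question, a hedged note: as far as the Filebeat documentation describes it, parsers run inside the input while lines are read, and processors are applied afterwards to the resulting event, so the position of the keys in the YAML file should not matter. What does change is the `message` a processor receives: with multiline enabled it can contain several lines joined together, so a processor written for single lines may stop matching. A sketch under those assumptions, where the path, pattern and tokenizer are all hypothetical and not taken from this thread:

```yaml
filebeat.inputs:
- type: filestream
  paths:
    - /var/log/app/app.log                       # hypothetical path
  parsers:
    - multiline:
        type: pattern
        pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'   # hypothetical: each event starts with a date
        negate: true
        match: after
  processors:
    # Applied to the already-joined event, so "message" may span several lines.
    - dissect:
        tokenizer: '%{timestamp} %{level} %{rest}'   # hypothetical tokenizer
        field: message
        target_prefix: parsed
```

If a processor only fails once multiline is enabled, checking it against a joined multi-line sample rather than a single line is a reasonable first step.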

Best regards
@fgjensen

I guess I just ran into the issue of processors as mentioned by @fgjensen
My paths depend on kubernetes metadata fields, using filestream and configuring multiline as:

    multiline.type: pattern
    multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:|^Message History'
    multiline.negate: false
    multiline.match: after

Filebeat starts up and then errors out because it can't access the log file path, which is:

/var/lib/docker/containers/${data.kubernetes.namespace}_${data.kubernetes.pod.name}_${data.kubernetes.pod.uid}/${data.kubernetes.container.name}/**/*.log*

and the error logs:

{"level":"info","timestamp":"2022-01-07T16:28:59.220Z","logger":"monitoring","caller":"log/log.go:160","message":"Stopping metrics logging."}
{"level":"info","timestamp":"2022-01-07T16:28:59.220Z","caller":"instance/beat.go:498","message":"filebeat stopped."}
{"level":"error","timestamp":"2022-01-07T16:28:59.220Z","caller":"instance/beat.go:1015","message":"Exiting: Failed to start crawler: starting input failed: Error while initializing input: missing field accessing 'filebeat.inputs.1.paths.0' (source:'filebeat.yml')"}
Exiting: Failed to start crawler: starting input failed: Error while initializing input: missing field accessing 'filebeat.inputs.1.paths.0' (source:'filebeat.yml')

Best Regards,
Ayush Mathur

Hi @fgjensen ,

After encountering the same issue and wasting some hours on it, I managed to debug the issue thanks to the fact that filebeat is opensource.

Basically, the error says "unknown matcher type: ". After the ':', the code prints the value found for multiline.match. In my case this space is blank, which means the parser is unable to find the value specified in the configuration.

There are 2 workarounds to this:

  1. the one you mentioned where you specifically type multiline.match
  2. indent all entries under the multiline statements (and this is actually where the documentation is wrong). Instead of looking like this:
filebeat.inputs:
- type: filestream    
  enabled: true
  paths:
    - /root/21.log
  parsers:
    - multiline:
      type: pattern
      pattern: ^([\d-]{8,10} [\d:.]{8,12})

it should look like this

filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /root/21.log
  parsers:
    - multiline:
        type: pattern
        pattern: ^([\d-]{8,10} [\d:.]{8,12})
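
To see why the indentation matters, it may help to write both layouts in YAML flow style. This is only a sketch of how a YAML parser reads the two snippets above. The top-level keys `documented_layout` and `indented_layout` are made up for the comparison and are not Filebeat options.

```yaml
# Hypothetical top-level keys, used only to compare the two structures.

# With "type" and "pattern" at the same indentation as "multiline:", they
# become siblings of "multiline", which is left with a null value.
documented_layout:
  - {multiline: null, type: pattern, pattern: '^([\d-]{8,10} [\d:.]{8,12})'}

# Indenting the options one level deeper nests them under "multiline".
indented_layout:
  - {multiline: {type: pattern, pattern: '^([\d-]{8,10} [\d:.]{8,12})'}}
```

In the first form the multiline parser receives no settings of its own, which would be consistent with the empty value after "unknown matcher type:" in the error message.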

@Emde not sure what exactly is wrong with filestream implementation, but when I'm trying to give multiline.(type/pattern/negate/match) OR as per your suggestion:

                  parsers:
                    - ndjson:
                        keys_under_root: true
                        ignore_decoding_error: true
                        message_key: message
                        expand_keys: true
                    - multiline:
                        pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:|^Message History'
                        negate: false
                        match: after

my logs are still coming in as plain string and not parsed as JSON and multi-line is not working.

@Ayush_Mathur can you give us a sample of the text you are expecting to be covered by this configuration?

The configuration of filestream multiline is not working.

It seems that parsers is an array type, so the following works for me:

parsers:
- multiline.type: pattern
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

Notice that there is a '-' at the beginning of the first multiline entry.

Or

parsers:
- multiline:
    type: pattern
    pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    negate: true
    match: after

Notice that type: is indented two spaces further than multiline:
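
Both forms above describe a single list entry under `parsers`. Beats configuration files generally accept dotted keys as shorthand for nested keys, which would explain why the two behave the same. What seems to differ for filestream is where the keys are placed: an earlier reply reports that `multiline.*` keys directly on the input let Filebeat start but did not join the lines. A sketch of the placement, with a hypothetical path:

```yaml
filebeat.inputs:
- type: filestream
  paths:
    - /var/log/app/*.log        # hypothetical path
  # Layout used by the log input, sitting outside "parsers". Reported above
  # as accepted at startup but not joining lines for filestream:
  # multiline.type: pattern
  parsers:
    # Dotted keys inside the list entry, equivalent to nesting them under
    # "multiline:" with deeper indentation.
    - multiline.type: pattern
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
```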

Opened [DOCS] [Filebeat] Confusing multiline documentation for filestream · Issue #29919 · elastic/beats · GitHub

Hello everyone,
I was struggling with multiline on 7.16, too. My "old" 7.8 Filebeat worked well with input "type: log". I tried nearly ALL the possibilities I could imagine, but without success.
After finding this discussion, I immediately tried out the proposals (indenting).
Now Filebeat starts without problems, but multiline is still not working with "parsers".

I "accidentally" put in my "old" config with input "type: log" again, along with the "old multiline standard" configuration ... and it works, but only that! ALL the other suggestions did NOT work with Filebeat 7.16 ...

- type: log
  enabled: true
....
  multiline.pattern: '^INFO\s+\| jvm 1\s+\| [0-9]{4}\/[0-9]{2}\/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} \| \['
  multiline.negate: true
  multiline.match: after

Hello all;

After a lot of work on this I have Filebeat 7.15.0 configured with 12 different, pretty complex filestream inputs. Most of the inputs are also configured with a multiline parser.

The 12 logs in this application are not configured in a similar manner. This adds of course complexity to the inputs. It was also awkward to collect test input from the many logs to cover all the different outcomes.

To sum up: the second workaround from @Emde made my day(s). Besides that I had very good help from these 3 tools:

  1. Regex Testing
  2. The Go Multiline Playground
  3. The dissect processor tester by Jorgelbg

Great tools, which saved me a lot of tedious testing time. It would be nice if the Go Playground supported all Filebeat multiline configuration settings. I hope the maintainer reads this!

Best regards
Flemming

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.