Inconsistent Multiline Matches


(Sadik Tekin) #1

After days of trial and error, I've found that regex patterns which work when tested elsewhere (the Go playground and RegEx101) match inconsistently in Filebeat.

Filebeat configuration:

  filebeat:
    prospectors:
      - input_type: log
        paths:
          - /var/opt/SFTP7_PC/logs/session_logs/session_SFTP_*.log
        encoding: plain
        include_lines:
          - 'S T O R*'
          - 'R E T R*'
        fields_under_root: false
        document_type: log
        scan_frequency: 10s
        harvester_buffer_size: 16384
        max_bytes: 10485760
        multiline:
          pattern: '\b\|(\d{2})\/(\d{2})\/(\d{4})\s+(\d{2}):(\d{2}):(\d{2}).(\d{3})\|\b'
          negate: false
          match: after
          max_lines: 3
        tail_files: false

This configuration works, but it only seems to match and concatenate the lines containing "226-Upload".
See Here
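
One possible contributor, not confirmed anywhere in this thread: the `\b` before and after the escaped pipes only matches when a word character (letter, digit or underscore) sits directly next to the pipe. Below is a minimal sketch that uses Python's `re` module as a stand-in for Filebeat's Go regexp engine, on the assumption that both treat this pattern the same way for ASCII input. The sample lines are hypothetical, since the real log layout is only linked above, not quoted.

```python
import re

# Pattern copied verbatim from the multiline section of the configuration.
PATTERN = re.compile(
    r'\b\|(\d{2})\/(\d{2})\/(\d{4})\s+(\d{2}):(\d{2}):(\d{2}).(\d{3})\|\b'
)

# Hypothetical lines: the real log layout is not shown in this thread.
samples = [
    "SESSION|01/12/2017 11:09:41.123|226-Upload complete",  # word chars touch both pipes
    "SESSION |01/12/2017 11:09:41.123| S T O R*",           # spaces touch both pipes
    "SESSION|01/12/2017 11:09:41.123|(none)",               # '(' follows the closing pipe
]

# True where the pattern is found anywhere in the line.
matches = [bool(PATTERN.search(line)) for line in samples]

for line, hit in zip(samples, matches):
    print(f"{hit!s:5}  {line}")
```

Only the first sample matches in this sketch. If the real lines differ in what surrounds the pipes, that alone could make an otherwise correct timestamp pattern match some lines and not others.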

All log lines begin with SESSION - is there an obvious reason why a simple pattern fails to match any lines?

When I use ^SESSION or \bSESSION\b, or even remove the literal pipes from the original pattern:
\b(\d{2})\/(\d{2})\/(\d{4})\s+(\d{2}):(\d{2}):(\d{2}).(\d{3})\b
the Filebeat debug output only ever shows the harvester backing off:

2017-01-12T11:09:41Z INFO Harvester started for file: /var/opt/SFTP7_PC/logs/session_logs/session_SFTP_1069386.log
2017-01-12T11:09:41Z DBG  End of file reached: /var/opt/SFTP7_PC/logs/session_logs/session_SFTP_1069386.log; Backoff now.
2017-01-12T11:09:42Z DBG  End of file reached: /var/opt/SFTP7_PC/logs/session_logs/session_SFTP_1069386.log; Backoff now.
2017-01-12T11:09:43Z DBG  End of file reached: /var/opt/SFTP7_PC/logs/session_logs/session_SFTP_1069386.log; Backoff now.
2017-01-12T11:09:44Z DBG  End of file reached: /var/opt/SFTP7_PC/logs/session_logs/session_SFTP_1069386.log; Backoff now.
2017-01-12T11:09:45Z DBG  End of file reached: /var/opt/SFTP7_PC/logs/session_logs/session_SFTP_1069386.log; Backoff now.
2017-01-12T11:09:46Z DBG  Flushing spooler because of timeout. Events flushed: 1
2017-01-12T11:09:46Z DBG  No events to publish

Thanks!


(Steffen Siering) #2

TBH, from your post I don't fully understand the actual problem you're facing.

Which filebeat version are you using?

Do you have some sample log content for testing with this playground?

Multiline support has a timeout for the case where an event is still being buffered but the file has not been updated for N seconds. When the timeout fires, the current buffer is flushed. Have you tried disabling the timeout? For example, if your upload/download takes longer than the multiline timeout, the lines will not be combined correctly.

From your link I can't really tell exactly which lines you want to merge, and it would help to see a more complete log. I wonder whether you really want multiline (multiline only merges successive lines) or whether you need some way of joining multiple lines by some key. The problem is: what if multiple concurrent uploads/downloads are active? Will the logs for the 2 sessions be intermixed? In the latter case you might have a 'bigger' problem, as there doesn't really seem to be a consistent session ID being logged for all messages.

What's the issue with backoff? By default the reader in Filebeat uses multiple processing layers: 1) read file, 2) split lines, 3) multiline, and so on. The backoff comes from the first layer, once the reader has reached the end of the file (it cannot read any more content because the OS signals end of file). When there is no new content, the reader waits and then retries reading from your log (see the options backoff and max_backoff).
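
The wait-and-retry behaviour can be sketched roughly as follows. This is an illustration only, not Filebeat's actual code: the function name `backoff_waits` is hypothetical, the growth factor is an assumption about how the wait increases between `backoff` and `max_backoff`, and the numbers are picked for the example.

```python
def backoff_waits(backoff, max_backoff, factor, attempts):
    """Return the wait (in seconds) before each retry after an empty read."""
    waits = []
    current = backoff
    for _ in range(attempts):
        waits.append(current)
        # Grow the wait after every empty read, but never beyond max_backoff.
        current = min(current * factor, max_backoff)
    return waits


# Illustrative values: start at 1s, cap at 10s, double after each empty read.
waits = backoff_waits(backoff=1, max_backoff=10, factor=2, attempts=6)
print(waits)  # [1, 2, 4, 8, 10, 10]
```

In this model a "Backoff now" message just means the reader found nothing new to read, so on its own it does not indicate that a pattern failed to match.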


(Sadik Tekin) #3

Thanks for the reply. I finally have the best pattern match, and it's a lot simpler than I first thought. To be clearer: all I needed was to match a single line and then concatenate the following 2 into a single string for Logstash groks. The mistake I made was trying to find a pattern that matched ALL lines!

To help others avoid the same error: in a setup like mine, the multiline regex pattern should ONLY match the initial line, not the consecutive lines that need to be appended to it.

I had to negate the pattern so that the next 2 entries, which do not match it, are appended to the line that does. The pattern was simplified to '\s+R\*$', which successfully matches all lines ending in S T O R* and R E T R*. So the only things changed in the config were the simplified pattern and negate set to true.
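
For anyone wanting to see why this works, here is a minimal sketch of how `negate` and `match: after` group lines. It is a simplified model written for this thread, not Filebeat's implementation: `group_multiline` is a hypothetical helper, the multiline timeout is ignored, merged lines are assumed to be joined with a newline, and the sample lines are invented because the real logs are only linked.

```python
import re


def group_multiline(lines, pattern, negate, max_lines):
    """Simplified model of multiline grouping with match: after."""
    regex = re.compile(pattern)
    events, current = [], []
    for line in lines:
        matched = bool(regex.search(line))
        if negate:
            matched = not matched
        if matched and current:
            # Continuation line: append it to the event being built.
            # Lines beyond max_lines are dropped in this model.
            if len(current) < max_lines:
                current.append(line)
        else:
            # This line starts a new event; emit the previous one first.
            if current:
                events.append("\n".join(current))
            current = [line]
    if current:
        events.append("\n".join(current))
    return events


# Hypothetical session log lines.
lines = [
    "SESSION cmd: S T O R*",
    "SESSION 150 Opening data connection",
    "SESSION 226-Upload complete",
    "SESSION cmd: R E T R*",
    "SESSION 150 Opening data connection",
    "SESSION 226-Download complete",
]

# Pattern matches only the first line of each group; negate is true.
good = group_multiline(lines, r'\s+R\*$', negate=True, max_lines=3)

# Pattern matches every line; negate is false (the original mistake).
bad = group_multiline(lines, r'^SESSION', negate=False, max_lines=3)

print(len(good), "events with the simplified pattern")
print(len(bad), "event(s) when every line matches")
```

In this model the simplified pattern yields one event per transfer, while a pattern that matches every line folds everything into a single event and drops whatever exceeds max_lines.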

Regarding concurrent uploads/downloads (well spotted, by the way): these are SFTP logs, and each SFTP session creates its own log file, which avoids intermixed messages.

As larger files will inevitably take longer to transfer, I increased two values: backoff: '5m' and max_backoff: '10m'.

Further example logs can be found here

Now onto the fun stuff with Kibana!

