Multiline POSIX regexp pattern - Java stack trace pattern not working


(Daniel Sand) #1

Hey Folks,

I'm currently about to make the switch from:

logstash 1.4.x using lumberjack as shipper => logstash 2.x and filebeat as shipper

so I'm trying to catch Java stack traces as one message with a filebeat multiline pattern:

---
filebeat:
  prospectors:
  - paths:
    - "/var/log/kafka/*.log"
    ignore_older: 24h
    document_type: kafka
    input_type: log
    scan_frequency: 15s
    harvester_buffer_size: 16384
    multiline:
      pattern: '(^[[:digit:]]+[[:space:]]error)|(^.+Exception: .+)|(^[[:space:]]+at .+)|(^[[:space:]]+... [[:digit:]]+ more)|(^[[:space:]]*Caused by:.+)'
      negate: true
      match: after

I converted the logstash pattern from regular regexp syntax to POSIX regexp and tried multiple variations, like:

pattern: (^[[:digit:]]+[[:space:]]error)|(^.+Exception: .+)|(^[[:space:]]+at .+)|(^[[:space:]]+... [[:digit:]]+ more)|(^[[:space:]]*Caused by:.+)
pattern: '(^[[:digit:]]+[[:space:]]error)|(^.+Exception: .+)|(^[[:space:]]+at .+)|(^[[:space:]]+... [[:digit:]]+ more)|(^[[:space:]]*Caused by:.+)'
pattern: "(^[[:digit:]]+[[:space:]]error)|(^.+Exception: .+)|(^[[:space:]]+at .+)|(^[[:space:]]+... [[:digit:]]+ more)|(^[[:space:]]*Caused by:.+)"
pattern: "(^\d+\serror)|(^.+Exception: .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
pattern: (^\d+\serror)|(^.+Exception: .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)
pattern: '(^\d+\serror)|(^.+Exception: .+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)'
pattern: \(\^\[\[:digit:\]\]\+\[\[:space:\]\]error\)\|\(\^\.\+Exception: \.\+\)\|\(\^\[\[:space:\]\]\+at \.\+\)\|\(\^\[\[:space:\]\]\+\.\.\. \[\[:digit:\]\]\+ more\)\|\(\^\[\[:space:\]\]\*Caused by:\.\+\)

I'm kinda lost - I tested the regexp in multiple ways:
http://regexr.com/
http://www.regexplanet.com/advanced/golang/index.html

They all work and find the stack trace.

example:

java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
    at kafka.utils.Utils$.read(Utils.scala:381)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
    at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
    at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
    at kafka.network.BlockingChannel.receive(BlockingChannel.scala:111)
    at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:133)
    at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)

I also followed, as far as I could, the documentation page on regexp support:

https://www.elastic.co/guide/en/beats/filebeat/current/regexp-support.html#unsupported-regexp-patterns

I'm using the debug parameters for the CLI:
filebeat -e -c /etc/filebeat/filebeat.yml -d "*"

If I remove the multiline section, all is fine, but the Java stack trace is cluttered.
As soon as I try the different patterns, the log just gets ignored completely. No error or anything else.

So does somebody maybe have a Java stack trace pattern that works for multiline, or any clue where I could have chosen the wrong path, or just a way to debug this properly? You would make me a 127% happy camper!

Thanks :)


(Andrew Kroh) #2

It looks like you are missing multiline: in your configuration file. https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html#multiline


(Daniel Sand) #4

Sorry, copy and paste error on my side - it's in there.

Well, and after totally rigging the post - I rest my case -_____-


(Daniel Sand) #5

You were right -_- the regexp is now partially working - after half a day I got reckless.

thanks for the heads up
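
One possible reason for "partially working" (an assumption, the thread does not say) is the negate flag. As far as I understand the documented semantics, with match: after and negate: true the lines that do NOT match the pattern are appended to the previous line that does match, which is the opposite of what a pattern describing the stack trace lines needs. The sketch below simulates that in Python. It is not Filebeat code: group_after and the two INFO lines are made up for illustration, and \d and \s stand in for [[:digit:]] and [[:space:]] because Python's re module has no POSIX bracket classes.

```python
import re

# Python stand-in for the pattern from the first post: \d and \s replace
# [[:digit:]] and [[:space:]], which Python's re module does not support.
STACK_PATTERN = re.compile(
    r"(^\d+\serror)|(^.+Exception: .+)|(^\s+at .+)"
    r"|(^\s+... \d+ more)|(^\s*Caused by:.+)"
)


def group_after(lines, pattern, negate):
    """Approximate `match: after`: continuation lines join the previous event."""
    events = []
    for line in lines:
        matched = pattern.search(line) is not None
        # negate: false -> matching lines are continuations
        # negate: true  -> non-matching lines are continuations
        is_continuation = matched != negate
        if is_continuation and events:
            events[-1].append(line)
        else:
            events.append([line])
    return events


log = [
    "INFO controller started",  # made-up log line, not from the thread
    "java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.",
    "    at kafka.utils.Utils$.read(Utils.scala:381)",
    "    at kafka.network.BlockingChannel.receive(BlockingChannel.scala:111)",
    "INFO controller stopped",  # made-up log line, not from the thread
]

negated = group_after(log, STACK_PATTERN, negate=True)
plain = group_after(log, STACK_PATTERN, negate=False)
print(len(negated), len(plain))
```

In this simulation negate: true yields four events (every trace line starts its own event and the trailing INFO line is glued to the last "at" line), while negate: false yields two. Note that even with negate: false the trace is appended to the preceding INFO line, because the pattern also matches the first line of the trace.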


(Steffen Siering) #6

The regex looks pretty complicated. Why not just check for leading whitespace?


(Daniel Sand) #7

Well, fair enough - this works so far. Thanks :)

multiline:
  pattern: ^[[:space:]]
  match: after
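
For anyone checking what the whitespace pattern does before deploying it, here is a rough Python simulation of match: after with the default negate: false. It is only a sketch under assumptions: merge_continuations is a hypothetical helper, \s stands in for [[:space:]] (Python's re module has no POSIX classes), and the "Caused by:" and "... 8 more" lines are invented, since the trace in the first post has none.

```python
import re


def merge_continuations(lines, pattern):
    """Append every line matching `pattern` to the previous event."""
    rx = re.compile(pattern)
    events = []
    for line in lines:
        if rx.search(line) and events:
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


trace = [
    "java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.",
    "    at kafka.utils.Utils$.read(Utils.scala:381)",
    "    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)",
    "Caused by: java.io.IOException: connection reset",  # hypothetical line
    "    ... 8 more",  # hypothetical line
]

whitespace_only = merge_continuations(trace, r"^\s")
with_caused_by = merge_continuations(trace, r"^\s|^Caused by:")
print(len(whitespace_only), len(with_caused_by))
```

The whitespace-only pattern splits this trace into two events, because "Caused by:" starts in column one. Adding that alternative keeps it as one event. The Filebeat equivalent would presumably be pattern: '^[[:space:]]|^Caused by:', but I have not tested that against a real Filebeat.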
