How to get all matches for a grok pattern in a multiline message

(markus mayer) #1

I have a multiline log format that consists of an XML document with a very deeply nested message structure. Any element in the structure can have a child element called Message that carries a value attribute, like so:

              <Message value="foo"/>
         <Message value="bar"/>
    <Message value="even more bar"/>

This means the number of Message elements varies from message to message and can get quite large. What I would like to do is build a Kibana table showing all message values with their respective number of occurrences.

To do so, I thought I could extract an array of all value attributes in my multiline message with a pattern like:

    grok {
        break_on_match => false
        match => ["message","<Message value=\"%{DATA:msgText}\""]
    }

However, grok only finds the first match for my pattern. If it is not possible to get all matches, is it possible to get the first n matches?
If this is not possible with Logstash filtering, can I get something like that done on the Elasticsearch/Kibana side?

Any help appreciated.

(Magnus Bäck) #2

Can't you use the xml filter? It looks like the xpath parameter should do exactly what you want:

Values returned by XPath parsring [sic] from xpath-synatx [sic] will be put in the destination field. Multiple values returned will be pushed onto the destination field as an array.
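
As a rough sketch of what that could look like for the log format in the question (untested, and the details are assumptions on my part): the XPath expression `//Message/@value` is meant to select the value attribute of every Message element at any depth, and the destination field name `msgText` is simply borrowed from the grok attempt above. This also assumes the XML uses no namespaces.

    filter {
        xml {
            # Parse the XML held in the "message" field
            source => "message"
            # Do not store the whole parsed document, only the XPath results
            store_xml => false
            # Assumed expression: the value attribute of every Message element;
            # multiple results should end up in "msgText" as an array
            xpath => [ "//Message/@value", "msgText" ]
        }
    }

If the selected attributes do not come out as plain strings in your Logstash version, the expression may need adjusting, so treat this as a starting point rather than a verified config.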

(markus mayer) #3

Thanks Magnus,
works like a treat. You saved my day.

(Shaun Wells) #4

Any chance of seeing what your config looks like when using the xml filter? I'm having a few issues with it myself.

(markus mayer) #5

Hi Shaun,
my config for the xml filter is pretty simple. I use it like this, with an XPath expression to select the parts I'm interested in.

    xml {
        source => "message"
        target => "doc"
        store_xml => false
        xpath => [ "//Message[@id!=0]", "msgs" ]
    }
