How to split and parse message properly with different plugins?


#1

Hi,
I've been racking my brain over this for some time now, but I just can't get it right.
Logstash version 5.5.2

The goal is simple: take the first part of the message and treat it as the timestamp (optionally dropping it afterwards), then take the second part, parse the XML/SOAP message, and send it to Elasticsearch.
I can get the timestamp OR I can parse the XML part, but I cannot do both at the same time. Using filters with split + mutate (see below) almost does it, but produces an extra event.

Log entry (shortened; real entries can be very long, but the basic structure is the same):

2017-08-12 01:37:52 <?xml  version="1.0" encoding="UTF-8"?><S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/"><ns2:producer>12345678</ns2:producer><S:Header><xmlns:ns4="http://schemas.xmlsoap.org/soap/encoding/"></S:Header><S:Body><ns4:code xsi:type="xsd:string">XYZsomething</ns4:code></S:Body></S:Envelope>

Filter config:

filter {
        grok {
                match => [ "message", "%{TIMESTAMP_ISO8601:messageTimestamp} " ]
        }
        date {
                locale => "en"
                match => [ "messageTimestamp", "yyyy-MM-dd HH:mm:ss" ]
                target => "@timestamp"
        }

The above works as expected. The following part is problematic:

        split {
                field => "message"
                target => "msg"
                terminator => "<?xml"
        }

It does split the message, but instead of creating an array (with an element for each part, e.g. arr[0] and arr[1], so I could manipulate both parts independently) it creates two separate "msg" values, how to say, "two parts with the same name"? I fail to understand why it does not create an array. How could I address only one of them? Anyhow, I then thought I would reconstruct the XML part and parse it.

        mutate {
                replace => [ "msg", '<?xml%{msg}' ]
        }
        xml {
                source => "msg"
                store_xml => "false"
                target => "mytarget"
                xpath => [
                        "//*[local-name()='producer']/text()", "producer"
                ]
        }

At first glance it works, but it produces two events instead of one. Example:

{
           "msg" => "<?xml2017-08-12 01:37:52 ",
    "@timestamp" => 2017-08-11T22:37:52.000Z,
      "@version" => "1"
}
{
           "msg" => "<?xml  version=\"1.0\" encoding=\"UTF-8\"?><S:Envelope xmlns:S=\"http://schemas.xmlsoap.org/soap/envelope/\"><ns2:producer>12345678</ns2:producer><S:Header><xmlns:ns4=\"http://schemas.xmlsoap.org/soap/encoding/\"></S:Header><S:Body><ns4:code xsi:type=\"xsd:string\">XYZsomething</ns4:code></S:Body></S:Envelope>",
    "@timestamp" => 2017-08-11T22:37:52.000Z,
      "@version" => "1",
      "producer" => [
        [0] "12345678"
    ]
}

The question is: how can I use the time field as the timestamp AND parse the XML afterwards? Or how can I omit the first part of "msg"?

Thanks.
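
A note on the array question: the split filter is meant to turn one event into several, one per piece, which is why two events come out rather than one event holding an array. If an array inside a single event is what is wanted, the split option of the mutate filter may be closer. A minimal sketch, assuming the literal "<?xml" occurs exactly once per line and reusing the field name msg from the config above:

```
filter {
        mutate {
                # Turns "message" into an array inside the same event:
                # [message][0] is the timestamp part, [message][1] is the XML part.
                # Assumption: "<?xml" appears only once in each log line.
                split => { "message" => "<?xml" }
        }
        mutate {
                # The separator is consumed by the split, so put "<?xml" back
                # in front of the second element before handing it to the xml filter.
                add_field => { "msg" => "<?xml%{[message][1]}" }
        }
}
```

The second mutate block is kept separate so that the array already exists when [message][1] is referenced.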


#2

Hi,
Whining helps.
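I found a working solution myself shortly after posting.
The grok match line is the key: capturing the rest of the line with %{GREEDYDATA:msg} puts the XML part into its own field, so the split filter is no longer needed.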

filter {
        grok {
                match => [ "message", "%{TIMESTAMP_ISO8601:messageTimestamp} %{GREEDYDATA:msg}" ]
        }
        date {
                locale => "en"
                match => [ "messageTimestamp", "yyyy-MM-dd HH:mm:ss" ]
                target => "@timestamp"
        }
        xml {
                source => "msg"
                store_xml => "false"
                target => "mytarget"
                xpath => [
                        "//*[local-name()='producer']/text()", "producer"
                ]
        }

        mutate {
                remove_field => [ "message", "msg", "messageTimestamp", "path" ]
        }
}
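
For anyone who wants to check a filter like this before pointing it at Elasticsearch, a throwaway pipeline that reads from stdin and prints events in the same rubydebug format as the output shown in the question can help. This is only a sketch: the file name test.conf is hypothetical, and only the grok block is included here, so the remaining filter blocks from the post above would need to be pasted in.

```
input {
        # Paste a sample log line into the terminal to feed it through the filters.
        stdin { }
}
filter {
        grok {
                match => [ "message", "%{TIMESTAMP_ISO8601:messageTimestamp} %{GREEDYDATA:msg}" ]
        }
        # Add the date, xml and mutate blocks from the post above here.
}
output {
        # Prints each event as a hash, one per input line.
        stdout { codec => rubydebug }
}
```

Saved as test.conf (hypothetical name), it can be started with bin/logstash -f test.conf. With the grok pattern from this post, each pasted line should come out as a single event carrying both messageTimestamp and msg.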
