Split multi-nested XML in Logstash

Hi All,

Here's a sample of my XML data:

<metric-datas>
  <metric-data>
    <metricId>120778</metricId>
    <metricName>  Average Response Time (ms)  </metricName>
    <metricPath> /ordering/Login.aspx  </metricPath>
    <frequency>ONE_MIN</frequency>
    <metricValues>
      <metric-value>
        <startTimeInMillis>15821000</startTimeInMillis>
        <occurrences>3</occurrences>
        <current>0</current>
        <min>0</min>
        <max>47</max>
        <useRange>true</useRange>
        <count>803</count>
        <sum>1770</sum>
        <value>2</value>
        <standardDeviation>0</standardDeviation>
      </metric-value>
    </metricValues>
  </metric-data>
</metric-datas>

The above has several levels of nested XML. I want to flatten it and keep only the metric values inside.

My expected output is:

{
  metricId: 120778
  metricName: Average Response Time (ms)
  metricPath: /ordering/Login.aspx
  frequency: ONE_MIN
  startTimeInMillis: 15821000
  occurrences: 3
  current: 0
  min: 0
  max: 47
  useRange: true
  count: 803
  sum: 1770
  value: 2
  standardDeviation: 0
}

as separate row fields.

Here's my current conf file. It is faulty, but I hope I can get it corrected with your help.

input {
  http_poller {
    urls => {
      url => "https://appd.com/controller/virtual%20Response%20Time%20%28ms%29&time-range-type=BEFORE_NOW&duration-in-mins=5"
    }
    truststore => "path/to/cacerts.jks"
    truststore_password => "*****"
    request_timeout => 60
    user => "*****"
    password => "****"
    metadata_target => "http_poller_metadata"
    schedule => { cron => "* * * * * UTC" }
  }
}
filter {
  xml {
    source => "[metric-datas]"
    store_xml => "false"
  }
  split {
    field => "[metric-datas]"
  }
}
output {
  elasticsearch {
    hosts => ["10.1.533.209:9200"]
    index => "appdmetric"
  }
  stdout { codec => rubydebug }
}

Thank you!

Katara

If store_xml is false, and there are no xpath expressions, then all the xml filter does is validate the XML (and it is rather liberal in doing that). Also, if you change store_xml to true, you will not have a metric-datas field; the field inside target will be [metric-data].
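
To make the two modes concrete, here is a minimal sketch. It assumes the XML document arrives in a field named message (an assumption at this point in the thread; the rubydebug output shows the actual field name) and uses xmldata as an arbitrary target name.

```
filter {
  # Mode 1: store_xml => true parses the whole document into a nested
  # structure under the target field. The root element <metric-datas> is
  # not kept, so the data appears under [xmldata][metric-data].
  xml {
    source    => "message"   # assumed source field
    store_xml => true
    target    => "xmldata"   # arbitrary name chosen for this sketch
  }

  # Mode 2 (alternative): store_xml => false with xpath copies only the
  # selected values into fields of their own.
  # xml {
  #   source    => "message"
  #   store_xml => false
  #   xpath     => { "/metric-datas/metric-data/frequency/text()" => "frequency" }
  # }
}
```

Only one of the two xml blocks would normally be needed; which one fits depends on whether the whole document or just a few values should end up in the event.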

Thank you, @Badger, for helping me out.

I changed a few things as per your suggestion:

filter {
  xml {
    source => "[metric-data]"
    target => "xmldata"
    store_xml => "true"
  }
  split {
    field => "[metric-data]"
  }
}

However, my data is still unsplit.

Is there anything I'm missing?
There has to be a data split in
<metric-data>
and also in
<metricValues> <metric-value>

Katara.

What does the rubydebug output on stdout look like?

@Badger, it shows the entire XML data under message.

If your XML is in a field called message then why are you telling the xml filter to look at a field called metric-data?

@Badger, okay,
I see what I'm missing.
I changed my filter to the below:

filter {
  xml {
    source => "message"
    target => "xmldata"
    store_xml => "true"
  }
  split {
    field => "message"
  }
}

and I get this:
image

How do I eliminate the tags and turn them into valid field names where necessary?
Kindly help me out!

Katara

I did some research and tried the following with a multiline codec:

input {
  http_poller {
    urls => {
      url => "https://appd.com/controller/virtual%20Response%20Time%20%28ms%29&time-range-type=BEFORE_NOW&duration-in-mins=5"
    }
    truststore => "path/to/cacerts.jks"
    truststore_password => "*****"
    request_timeout => 60
    user => "*****"
    password => "****"
    metadata_target => "http_poller_metadata"
    schedule => { cron => "* * * * * UTC" }
    codec => multiline {
      pattern => "<metric-datas>"
      negate => "true"
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    target => "xmldata"
    store_xml => "true"
    xpath => [
      "/metric-datas/metric-data/metricID/text()", "metricID",
      "/metric-datas/metric-data/metricName/text()", "metricName",
      "/metric-datas/metric-data/metricPath/text()", "metricPath",
      "/metric-datas/metric-data/frequency/text()", "Frequency",
      "/metric-datas/metric-data/metricValues/metric-value/startTimeInMillis/text()", "starttime",
      "/metric-datas/metric-data/metricValues/metric-value/occurences/text()", "occurences",
      "/metric-datas/metric-data/metricValues/metric-value/current/text()", "current",
      "/metric-datas/metric-data/metricValues/metric-value/min/text()", "min",
      "/metric-datas/metric-data/metricValues/metric-value/max/text()", "max",
      "/metric-datas/metric-data/metricValues/metric-value/useRange/text()", "UseRange",
      "/metric-datas/metric-data/metricValues/metric-value/count/text()", "count",
      "/metric-datas/metric-data/metricValues/metric-value/sum/text()", "sum",
      "/metric-datas/metric-data/metricValues/metric-value/value/text()", "value",
      "/metric-datas/metric-data/metricValues/metric-value/standardDeviation/text()", "SatndardDeviation",
    ]
  }
}

which is giving me an error:

`

[2020-02-26T07:38:23,938][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:appd, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, ", ', -, [, { at line 42, column 9 (byte 2006) after filter\n{\nxml {\n source => "message"\n\t\ttarget => "xmldata"\n store_xml => "true"\nxpath => [\n "/metric-datas/metric-data/metricID/text()", "metricID",\n "/metric-datas/metric-data/metricName/text()", "metricName",\n\t "/metric-datas/metric-data/metricPath/text()", "metricPath",\n\t "/metric-datas/metric-data/frequency/text()", "Frequency",\n\t "/metric-datas/metric-data/metricValues/metric-value/startTimeInMillis/text()", "starttime",\n\t "/metric-datas/metric-data/metricValues/metric-value/occurences/text()", "occurences",\n\t "/metric-datas/metric-data/metricValues/metric-value/current/text()", "current",\n\t "/metric-datas/metric-data/metricValues/metric-value/min/text()", "min",\n\t "/metric-datas/metric-data/metricValues/metric-value/max/text()", "max",\n\t "/metric-datas/metric-data/metricValues/metric-value/useRange/text()", "UseRange",\n\t "/metric-datas/metric-data/metricValues/metric-value/count/text()", "count",\n\t "/metric-datas/metric-data/metricValues/metric-value/sum/text()", "sum",\n\t "/metric-datas/metric-data/metricValues/metric-value/value/text()", "value",\n\t "/metric-datas/metric-data/metricValues/metric-value/standardDeviation/text()", "SatndardDeviation",\n ", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:41:in compile_imperative'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:49:in compile_graph'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:11:in block in compile_sources'", "org/jruby/RubyArray.java:2577:in map'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:10:in compile_sources'", "org/logstash/execution/AbstractPipelineExt.java:151:in initialize'", 
"org/logstash/execution/JavaBasePipelineExt.java:47:in initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:24:in initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:36:in execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:325:in block in converge_state'"]}

`

I'm not sure if I'm doing this right.

@Badger, please help me correct this.

Katara

You cannot have a comma after the last entry in an array.
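
For instance, the end of the xpath array would need to look like the fragment below, with no comma after the final pair. This is a shortened sketch that reuses two of the expressions from the config above; the remaining pairs are left out for brevity.

```
xpath => [
  "/metric-datas/metric-data/frequency/text()", "Frequency",
  # The last pair ends without a comma before the closing bracket.
  "/metric-datas/metric-data/metricValues/metric-value/value/text()", "value"
]
```

Running Logstash with --config.test_and_exit (short form -t) checks the syntax of a config file without starting the pipeline, which catches this kind of error early.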

@Badger,
I must have missed it. I corrected it, and I still do not get the separate fields.
Am I using the conditions right?
Please help me out.

Katara
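
One thing that may be worth checking, judging only from the sample XML at the top of the thread: XPath is case-sensitive, and the sample uses <metricId> and <occurrences>, whereas the expressions above use metricID and occurences. An expression that names an element that does not exist matches nothing, so no field is created for it. A sketch of those two entries with the element names matched to the sample (the destination field names here are arbitrary choices):

```
filter {
  xml {
    source    => "message"
    store_xml => false
    xpath     => [
      # Element names copied exactly from the sample XML
      "/metric-datas/metric-data/metricId/text()", "metricId",
      "/metric-datas/metric-data/metricValues/metric-value/occurrences/text()", "occurrences"
    ]
  }
}
```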

In case it helps anyone, this is what worked for me. Setting force_array => false is another change I made.

filter {
  xml {
    source => "message"
    target => "xmldata"
    store_xml => "true"
    force_array => false
    # XPath is case-sensitive, so element names must match the XML
    # exactly (<metricId>, <occurrences>).
    xpath => [
      "/metric-datas/metric-data/metricId/text()", "metricID",
      "/metric-datas/metric-data/metricName/text()", "metricName",
      "/metric-datas/metric-data/metricPath/text()", "metricPath",
      "/metric-datas/metric-data/frequency/text()", "Frequency",
      "/metric-datas/metric-data/metricValues/metric-value/startTimeInMillis/text()", "starttime",
      "/metric-datas/metric-data/metricValues/metric-value/occurrences/text()", "occurences",
      "/metric-datas/metric-data/metricValues/metric-value/current/text()", "current",
      "/metric-datas/metric-data/metricValues/metric-value/min/text()", "min",
      "/metric-datas/metric-data/metricValues/metric-value/max/text()", "max",
      "/metric-datas/metric-data/metricValues/metric-value/useRange/text()", "UseRange",
      "/metric-datas/metric-data/metricValues/metric-value/count/text()", "count",
      "/metric-datas/metric-data/metricValues/metric-value/sum/text()", "sum",
      "/metric-datas/metric-data/metricValues/metric-value/value/text()", "value",
      "/metric-datas/metric-data/metricValues/metric-value/standardDeviation/text()", "StandardDeviation"
    ]
  }
  # Drop the raw XML and the poller metadata that is not needed downstream.
  mutate {
    remove_field => [
      "message",
      "[http_poller_metadata][request][url]",
      "[http_poller_metadata][response_headers][x-xss-protection]",
      "[http_poller_metadata][response_headers][set-cookie]",
      "[http_poller_metadata][response_headers][x-content-type-options]",
      "[http_poller_metadata][response_headers][transfer-encoding]",
      "[http_poller_metadata][response_headers][content-type]",
      "[http_poller_metadata][response_headers][x-frame-options]",
      "[http_poller_metadata][response_message]",
      "[http_poller_metadata][request.method]",
      "[http_poller_metadata][code]",
      "[http_poller_metadata][times_retried]",
      "[http_poller_metadata][name]",
      "[http_poller_metadata][runtime_seconds]"
    ]
  }
}

Hope it helps!

Thanks :slight_smile:
Katara

Even better:

filter {
  xml {
    source => "message"
    store_xml => "true"
    target => "xmldata"
    force_array => false
  }
}

Simply this works with the input I've given!

Cheers! :slight_smile:

Katara.
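
For responses that contain more than one metric-data or metric-value element, the split filter can be applied to the parsed structure rather than to message. The following is a minimal sketch, not tested against the controller's real output. It assumes force_array is left at its default of true so that repeated elements are always parsed as arrays (split only handles strings and arrays), and that each metric-data holds a single metricValues wrapper, as in the sample.

```
filter {
  xml {
    source    => "message"
    store_xml => true
    target    => "xmldata"
    # force_array defaults to true, so every element is parsed as an array
  }
  # One event per <metric-data>
  split { field => "[xmldata][metric-data]" }
  # One event per <metric-value> inside the single <metricValues> wrapper
  split { field => "[xmldata][metric-data][metricValues][0][metric-value]" }
}
```

With force_array left on, leaf values are single-element arrays (for example [xmldata][metric-data][metricId][0]), so some further flattening may still be needed to reach the flat layout in the original question.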
