XML plugin parse

adrianfusco · May 6, 2022, 12:28pm

Hello,

I've been using the XML filtering plugin because I need to parse some XML data.

This is a simple example:

<task code="a01" status="wip"/>
<task code="a02" status="nwg"/>
<task code="a03" status="nwg">
     Description Line 1
     Description Line 2
     Description Line 3
     Description Line 4
</task>
<task code="a04" status="wip">
     <comment author="afusco">
         I've finished this part.
     </comment>
</task>

I'm trying to extract the code and status of these tasks:

filter {
  xml {
    source => "message"
    store_xml => false
    xpath => [
      "/task/@code", "task_code",
      "/task/@status", "task_status"
    ]
  }
}

The thing is:

It correctly parses the lines that contain only a <task> tag, and the output for those is what I expect. But when it processes the remaining lines, for example the comment tags, they are not parsed correctly.

To avoid this, I added the following simple condition to drop the lines that aren't task tags:

  if [message] !~ /^<task/ {
    drop { }
  }

But this is a workaround.

Is there any way to parse only the desired tags, so that Elasticsearch doesn't also receive the undesired data? It would be good to drop the data if it isn't matched by the xpath array.

Example of the output:

When it's a task tag:

{
    "path" => "/var/log/xml2.log",
    "@timestamp" => 2022-05-05T16:22:24.602Z,
    "@version" => "1",
    "task_code" => [
        [0] "a01"
    ],
          "task_status" => [
        [0] "wip"
    ],

When it's not:

{
      "@version" => "1",
    "@timestamp" => 2022-05-06T12:27:01.559Z,
          "host" => "elastic",
       "message" => "\t<comment author="afusco">",
          "path" => "/var/log/xml2.log"
}

Thanks.

Badger · May 6, 2022, 4:41pm

Use a multiline codec on the input to consume an entire XML document as a single event.

Possibly

  codec => multiline {
      pattern => "<task"
      negate => "true"
      what => "previous"
      auto_flush_interval => 10
  }
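
Combined with the xml filter from the question, the whole pipeline might look roughly like the sketch below. The file input, its path (taken from the "path" field in the output above), start_position, the anchored "^<task" pattern, the final conditional and the stdout output are assumptions for illustration; none of them are confirmed in this thread.

input {
  file {
    # Assumed input; the path is taken from the "path" field in the question's output.
    path => "/var/log/xml2.log"
    start_position => "beginning"
    codec => multiline {
      # Any line that does not start with "<task" is appended to the previous
      # line, so each <task> element arrives as a single event.
      pattern => "^<task"
      negate => true
      what => "previous"
      # Flush the last pending event after 10 seconds without new lines.
      auto_flush_interval => 10
    }
  }
}

filter {
  xml {
    source => "message"
    store_xml => false
    xpath => [
      "/task/@code", "task_code",
      "/task/@status", "task_status"
    ]
  }

  # Assumption: drop any event where the xpath expressions matched nothing.
  if ![task_code] {
    drop { }
  }
}

output {
  # Assumed output, for inspecting the events while testing.
  stdout { codec => rubydebug }
}

With the sample data from the question, the a03 and a04 tasks should then each arrive as one multi-line event, so the comment lines no longer show up as separate events.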

system · June 3, 2022, 4:41pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
