Problem extracting multiple events from XML


#1

Hello,

I've been struggling with this for a few days now and have been trawling the community endlessly for complete examples to help me out.

I currently have a problem parsing all the events from an XML file that contains many events.

I'm attaching a copy of what I have at work, in the hope that someone sees something I am missing. My problem is that only one event gets parsed (the first one) and none of the others show up.

I'm using Logstash version 6.3.2; the details are below.

Thanks,
Dean

logstash_test.conf

input {
  file {
    path => "/data/in/test/sample.xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "<element>"
      what => "previous"
      negate => "true"
    }
  }
}

filter {
  mutate {
    gsub => [
      "message", "[\r\n]", ""
    ]
  }
}

filter {
  xml {
    source => "message"
    store_xml => "false"
    remove_namespaces => "true"
    xpath => [
        "/element", "element"
    ]
  }
  
  if [element] {
    split {
      field => "element"
    }
	
    xml {
      source => "element"
      store_xml => "false"
      xpath => [
        "/element/field1/text()", "field1",
        "/element/field2/text()", "field2",
        "/element/field3/text()", "field3"
      ]
    }
	
    mutate {
        remove_field => ["message", "element"]
    }
  }
}

output {
  stdout { codec => rubydebug }
}
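
One thing that can be checked outside Logstash is whether the multiline pattern ever matches an opening tag from the sample file. The sketch below uses Python's `re` module as a stand-in for the regex engine Logstash uses, which is an assumption that should hold for a pattern this simple. `LOOSER_PATTERN` and the `matches` helper are hypothetical names used only for illustration, not part of the config above.

```python
import re

# The pattern from the multiline codec in the config above.
ORIGINAL_PATTERN = "<element>"

# Hypothetical alternative that also accepts attributes after the tag name.
LOOSER_PATTERN = r"<element[\s>]"

# Opening tag exactly as it appears in the sample XML.
OPENING_TAG = '<element type="INSTANCE">'


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    print("original pattern matches:", matches(ORIGINAL_PATTERN, OPENING_TAG))
    print("looser pattern matches:  ", matches(LOOSER_PATTERN, OPENING_TAG))
```

If the original pattern never matches, then with `negate => "true"` and `what => "previous"` every line would presumably be appended to the previous one, so the whole file would reach the filters as a single message. That is consistent with the file being one long line anyway, but it is a guess rather than something confirmed against Logstash 6.3.2.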

Sample XML

<?xml version="1.0" encoding="UTF-8"?><element type="INSTANCE"><field1>001</field1><field2>ready</field2><field3>local</field3></element><element type="INSTANCE"><field1>002</field1><field2>pause</field2><field3>local</field3></element><element type="INSTANCE"><field1>003</field1><field2>starting</field2><field3>remote</field3></element>

Pretty version for viewing. The file is actually one long line, but some of the fields have embedded carriage returns.

<?xml version="1.0" encoding="UTF-8"?>
<element type="INSTANCE">
  <field1>001</field1>
  <field2>ready</field2>
  <field3>local</field3>
</element>
<element type="INSTANCE">
  <field1>002</field1>
  <field2>pause</field2>
  <field3>local</field3>
</element>
<element type="INSTANCE">
  <field1>003</field1>
  <field2>starting</field2>
  <field3>remote</field3>
</element>
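
A point worth noting about this sample: it has three top-level `<element>` nodes and no single enclosing root, so it is not one well-formed XML document. The sketch below is a quick check in Python with the standard-library `xml.etree.ElementTree` parser, not with the parser Logstash uses. It shows that a strict parser rejects the file as-is, and that wrapping the content in a synthetic root makes all three elements reachable. The `<root>` wrapper and the function names are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# The sample file content, as one long line.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<element type="INSTANCE"><field1>001</field1><field2>ready</field2>'
    '<field3>local</field3></element>'
    '<element type="INSTANCE"><field1>002</field1><field2>pause</field2>'
    '<field3>local</field3></element>'
    '<element type="INSTANCE"><field1>003</field1><field2>starting</field2>'
    '<field3>remote</field3></element>'
)


def parses_strictly(text):
    """Return True if the text is accepted as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def extract_events(text):
    """Wrap the content in a synthetic <root> and return one dict per <element>."""
    # Drop the XML declaration, which may only appear at the very start of a document.
    body = text.split("?>", 1)[1] if text.lstrip().startswith("<?xml") else text
    root = ET.fromstring("<root>" + body + "</root>")
    return [
        {child.tag: child.text for child in element}
        for element in root.findall("element")
    ]


if __name__ == "__main__":
    print("well-formed as-is:", parses_strictly(SAMPLE))
    for event in extract_events(SAMPLE):
        print(event)
```

Whether the Logstash xml filter silently keeps only the first top-level element in this situation is an assumption on my part, but it would match the single event seen in the output.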

Expected output

This will actually be written to Elasticsearch, but I want to verify that I can make this work first.

{
  "field1" => "001",
  "field2" => "ready",
  "field3" => "local"
}
{
  "field1" => "002",
  "field2" => "pause",
  "field3" => "local"
}
{
  "field1" => "003",
  "field2" => "starting",
  "field3" => "remote"
}

But all I am currently seeing is:

{
  "field1" => "001",
  "field2" => "ready",
  "field3" => "local"
}

