XML Parse Returns an Empty Array

I have the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <ChainId>7290027600007</ChainId>
    <SubChainId>001</SubChainId>
    <StoreId>001</StoreId>
    <BikoretNo>9</BikoretNo>
    <DllVerNo>8.0.1.3</DllVerNo>
</root>

My conf file is:

input {
  file {
    path => "/usr/share/logstash/logs/example1.xml"
    type => "xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "<?xml version"
      negate => true
      what => "previous"
    }
  }
}

filter {
    xml {
        source => "message"
        store_xml => false
        xpath => [ "/root/ChainId/text()", "ChainId" ]
    }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200" # it used to be "host" and "port" pre-2.0
    index => "xml_index"
    manage_template => false
    #protocol => "http" # removed in 2.0
    #port => "443" # removed in 2.0
  }

  stdout { 
    codec => rubydebug 
  }
}

My Logstash output:

{
logstash_1       |           "path" => "/usr/share/logstash/logs/example1.xml",
logstash_1       |       "@version" => "1",
logstash_1       |           "type" => "xml",
logstash_1       |        "message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<root>\r\n    <ChainId>7290027600007</ChainId>\r\n    <SubChainId>001</SubChainId>\r\n    <StoreId>001</StoreId>\r\n    <BikoretNo>9</BikoretNo>\r\n    <DllVerNo>8.0.1.3</DllVerNo>\r",
logstash_1       |           "host" => "751b3a8bf341",
logstash_1       |        "ChainId" => [],
logstash_1       |           "tags" => [
logstash_1       |         [0] "multiline"
logstash_1       |     ],
logstash_1       |     "@timestamp" => 2019-03-24T20:15:11.278Z
logstash_1       | }

I have read a lot of posts about XML parsing in Logstash, but I still can't make it work. I don't understand what I am doing wrong.

P.S.
I also tried to remove the spaces and \r\n from the XML using gsub, but without success:

mutate {
  gsub => [...]
}

Unable to reproduce. If I use your file and file input as-is, I never see an event. If I add "auto_flush_interval => 2" to the multiline codec, then your filter produces:

   "message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root>\n    <ChainId>7290027600007</ChainId>\n    <SubChainId>001</SubChainId>\n    <StoreId>001</StoreId>\n    <BikoretNo>9</BikoretNo>\n    <DllVerNo>8.0.1.3</DllVerNo>\n</root>",
   "ChainId" => [
    [0] "7290027600007"
],

In your output [message] does not include </root>, which suggests you are actually using

pattern => "^</root>" what => "previous" negate => true

which results in invalid XML, but xpath still pulls the ChainId out of it

   "ChainId" => [
    [0] "7290027600007"
],

Can you double check that you posted matching configuration and output?
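
For reference, here is a minimal sketch of where that option goes, assuming the rest of the file input stays exactly as posted. auto_flush_interval is a setting of the multiline codec. With this pattern the codec only emits an event when the next "<?xml version" line arrives, and a file holding a single document never supplies one, so the buffered lines have to be flushed on a timer.

```
input {
  file {
    path => "/usr/share/logstash/logs/example1.xml"
    type => "xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "<?xml version"
      negate => true
      what => "previous"
      # Emit the buffered lines as one event after 2 seconds without new input
      auto_flush_interval => 2
    }
  }
}
```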

I checked again, and that is my exact configuration. I'm running the latest ELK stack (6.6) on the deviantony/docker-elk image.

My output does include </root>. I'm attaching my latest output again:

{
logstash_1       |     "@timestamp" => 2019-03-26T06:45:27.941Z,
logstash_1       |           "tags" => [
logstash_1       |         [0] "multiline"
logstash_1       |     ],
logstash_1       |           "host" => "751b3a8bf341",
logstash_1       |        "ChainId" => [],
logstash_1       |        "message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<root>\r\n    <ChainId>7290027600007</ChainId>\r\n    <SubChainId>001</SubChainId>\r\n    <StoreId>001</StoreId>\r\n    <BikoretNo>9</BikoretNo>\r\n    <DllVerNo>8.0.1.3</DllVerNo>\r\n</root>\r",
logstash_1       |           "path" => "/usr/share/logstash/logs/example1.xml",
logstash_1       |       "@version" => "1",
logstash_1       |           "type" => "xml"
logstash_1       | }

Maybe the problem is in how the XML is converted to a string inside message, which results in the \r\n sequences and the escaping of the whole document?

You are correct, sir! Judging by the file path, you are running on UNIX, but your file has Windows-style newlines (\r\n). The xml filter doesn't care about newlines in the string, but on UNIX it expects UNIX newlines, not Windows newlines. So you need to strip out the \r, which is Ctrl/M. You can type this using Ctrl/V Ctrl/M...

    mutate { gsub => [ "message", "^M", "" ] }
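
If typing a literal control character into the config is awkward, one possible alternative is to rely on escape processing in quoted strings. This sketch assumes that config.support_escapes: true has been set in logstash.yml (it is off by default), so that "\r" in a quoted string is read as a carriage return. Placing mutate ahead of the xml filter is also an assumption about how the two would be combined in this pipeline.

```
# Assumes logstash.yml contains:
#   config.support_escapes: true

filter {
  mutate {
    # Remove every carriage return so only UNIX newlines remain in message
    gsub => [ "message", "\r", "" ]
  }
  xml {
    source => "message"
    store_xml => false
    xpath => [ "/root/ChainId/text()", "ChainId" ]
  }
}
```

The mutate filter has to run before xml, since filters are applied in the order they appear in the config.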
