Problem parsing XML files with Logstash

Hi,

My XML file looks like this:

<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

My logstash configuration:

input {
  file {
    path => "C:\Users\Desktop\sample.xml"
    type => "xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      pattern => "^<\?note.*\>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "xml"
    xpath => [
      "/xml/to/text()", "To",
      "/xml/from/text()", "From",
      "/xml/heading/text()", "Heading",
      "/xml/body/text()", "Body"
    ]
    store_xml => "true"
  }
}
output {
  stdout { codec => rubydebug }
}

The error looks like this:

Pipeline aborted due to error {:exception=>"LogStash::ConfigurationError", :backtrace=>["D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-filter-xml-2.2.0/lib/logstash/filters/xml.rb:106:in `register'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:182:in `start_workers'", "org/jruby/RubyArray.java:1613:in `each'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:182:in `start_workers'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:136:in `run'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/agent.rb:491:in `start_pipeline'"], :level=>:error}
stopping pipeline {:id=>"main"}
The signal HUP is in use by the JVM and will not work correctly on this platform

Thanks

You need to specify the target option for the xml filter. It is required when store_xml is enabled.

https://www.elastic.co/guide/en/logstash/current/plugins-filters-xml.html#plugins-filters-xml-target

Unrelated, but I don't see how your XPath can be correct. You don't have any "xml" elements, but you do have a "note" element.

Thanks @magnusbaeck

I changed the Logstash configuration based on your suggestions:

input {
  file {
    path => "C:\Users\571952\Desktop\sample.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      pattern => "^<\?note.*\>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    xpath => [
      "/note/to/text()", "To",
      "/note/from/text()", "From",
      "/note/heading/text()", "Heading",
      "/note/body/text()", "Body"
    ]
    store_xml => true
    target => "doc"
  }
}
output {
  stdout { codec => rubydebug }
}

The output is still not correct:

Settings: Default pipeline workers: 4
Pipeline main started
SIGINT received. Shutting down the agent. {:level=>:warn}
stopping pipeline {:id=>"main"}
Error parsing xml with XmlSimple {:source=>"message", :value=>"<note>\r\n<to>Tove</to>\r\n<from>Jani</from>\r\n<heading>Reminder</heading>\r\n<body>Don't forget me this weekend!</body>\r\n</note>\r\n<note>\r\n<to>Tove</to>\r\n<from>Jani</from>\r\n<heading>Reminder</heading>\r\n<body>Don't forget me this weekend!</body>\r", :exception=>#<REXML::ParseException: #<RuntimeError: attempted adding second root element to document>
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/document.rb:94:in `add'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/element.rb:882:in `add'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/child.rb:21:in `initialize'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/parent.rb:13:in `initialize'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/element.rb:59:in `initialize'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/element.rb:880:in `add'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/element.rb:297:in `add_element'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/document.rb:101:in `add_element'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/parsers/treeparser.rb:33:in `parse'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/document.rb:249:in `build'
D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/document.rb:43:in `initialize'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:971:in `parse'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:164:in `xml_in'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:203:in `xml_in'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-filter-xml-2.2.0/lib/logstash/filters/xml.rb:186:in `filter'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/filters/base.rb:151:in `multi_filter'
org/jruby/RubyArray.java:1613:in `each'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/filters/base.rb:148:in `multi_filter'
(eval):42:in `filter_func'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:267:in `filter_batch'
org/jruby/RubyArray.java:1613:in `each'
org/jruby/RubyEnumerable.java:852:in `inject'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:265:in `filter_batch'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:223:in `worker_loop'
D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:201:in `start_workers'
...
attempted adding second root element to document
Line: 7
Position: 132
Last 80 unconsumed characters:
>, :backtrace=>["D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/parsers/treeparser.rb:95:in `parse'", "D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/document.rb:249:in `build'", "D:/logstash/logstash-2.4.0/vendor/jruby/lib/ruby/1.9/rexml/document.rb:43:in `initialize'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:971:in `parse'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:164:in `xml_in'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/xml-simple-1.1.5/lib/xmlsimple.rb:203:in `xml_in'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-filter-xml-2.2.0/lib/logstash/filters/xml.rb:186:in `filter'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/filters/base.rb:151:in `multi_filter'", "org/jruby/RubyArray.java:1613:in `each'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/filters/base.rb:148:in `multi_filter'", "(eval):42:in `filter_func'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:267:in `filter_batch'", "org/jruby/RubyArray.java:1613:in `each'", "org/jruby/RubyEnumerable.java:852:in `inject'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:265:in `filter_batch'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:223:in `worker_loop'", "D:/logstash/logstash-2.4.0/vendor/bundle/jruby/1.9/gems/logstash-core-2.4.0-java/lib/logstash/pipeline.rb:201:in `start_workers'"], :level=>:warn}
{
    "@timestamp" => "2017-06-20T13:30:33.300Z",
       "message" => "<note>\r\n<to>Tove</to>\r\n<from>Jani</from>\r\n<heading>Reminder</heading>\r\n<body>Don't forget me this weekend!</body>\r\n</note>\r\n<note>\r\n<to>Tove</to>\r\n<from>Jani</from>\r\n<heading>Reminder</heading>\r\n<body>Don't forget me this weekend!</body>\r",
      "@version" => "1",
          "tags" => [
        [0] "multiline",
        [1] "_xmlparsefailure"
    ],
          "path" => "C:\\Users\\571952\\Desktop\\sample.xml",
          "host" => "PC326815",
            "To" => [
        [0] "Tove"
    ],
          "From" => [
        [0] "Jani"
    ],
       "Heading" => [
        [0] "Reminder"
    ],
          "Body" => [
        [0] "Don't forget me this weekend!"
    ]
}
Pipeline main has been shutdown
The signal HUP is in use by the JVM and will not work correctly on this platform

Thanks

You're feeding more than one XML document to the filter.

Thanks @magnusbaeck. So you are saying I can't merge several XML files and send them to Logstash in bulk, right?

Suppose I have an XML file like this:

<root id="XYZ">
  <users>
    <someone id="john.doe" type="human">
      <priorities>
        <priority name="high"/>
      </priorities>
      <cities>
        <city name="London"/>
        <city name="Paris"/>
        <city name="Rome"/>
      </cities>
    </someone>
    <someone id="hal.9001" type="machine">
      <priorities>
        <priority name="low"/>
      </priorities>
      <cities>
        <city name="Jupiter"/>
        <city name="Paris"/>
      </cities>
    </someone>
  </users>
</root>

I want my output in this format:

"someoneId" => "john.doe"
"rootId" => "XYZ"
"priorityName" => "high"
"cities" => [
  [0] "London"
  [1] "Paris"
  [2] "Rome"
],
"someoneId" => "hal.9001"
"rootId" => "XYZ"
"priorityName" => "low"
"cities" => [
  [0] "Jupiter"
  [1] "Paris"
]

Is this possible in Logstash?

Thanks

So you are saying I can't merge several XML files and send them to Logstash in bulk, right?

I'm just saying that the string you're passing to the xml filter contains two XML documents (two consecutive "note" elements). The xml filter can only parse one document at a time.
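
One way to hand the xml filter a single document per event is to have the multiline codec start a new event at every opening "note" tag. The posted pattern "^<\?note.*\>" looks for a literal "<?note", which never occurs in the file, so every line is appended to the previous one and both documents end up in one message. A minimal sketch of the input block, assuming each <note> tag sits at the start of its own line and that the installed multiline codec version supports auto_flush_interval (the value 2 is only an example); the filter and output sections stay as posted:

```
input {
  file {
    path => "C:\Users\Desktop\sample.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      # A line that begins with <note> starts a new event;
      # every other line is appended to the previous event.
      pattern => "^\s*<note>"
      negate => true
      what => "previous"
      # Assumption: emit the last buffered event after 2 seconds without
      # new lines, since no further <note> line arrives to close it.
      auto_flush_interval => 2
    }
  }
}
```

With this, each event's message field should hold exactly one <note>...</note> element, which the xml filter can parse on its own.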
