Problem parsing large XML with Logstash?

Logstash 2.4.1

I want to parse an XML file:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <catalog>
      <cd country="USA">
        <title>Empire Burlesque</title>
        <artist>Bob Dylan</artist>
        <price>10.90</price>
      </cd>
      <cd country="UK">
        <title>Hide your heart</title>
        <artist>Bonnie Tyler</artist>
        <price>10.0</price>
      </cd>
      <cd country="USA">
        <title>Greatest Hits</title>
        <artist>Dolly Parton</artist>
        <price>9.90</price>
      </cd>
    </catalog>

I want the output in this format:

  "country" => "USA" ,
     "title" => [
    [0] "Empire Burlesque"
],
    "artist" => [
    [0] "Bob Dylan"
],
     "price" => [
    [0] "10.90"
],
      "country" => "UK" ,
     "title" => [
    [0] "Hide your heart"
],
    "artist" => [
    [0] "Bonnie Tyler"
],
     "price" => [
    [0] "10.0"
],
and so on for the remaining entries.

But what I get is this:

 "country" => [
    [0] "USA",
    [1] "UK",
    [2] "USA"
],
     "title" => [
    [0] "Empire Burlesque",
    [1] "Hide your heart",
    [2] "Greatest Hits"
],
    "artist" => [
    [0] "Bob Dylan",
    [1] "Bonnie Tyler",
    [2] "Dolly Parton"
],
     "price" => [
    [0] "10.90",
    [1] "10.0",
    [2] "9.90"
]

My Logstash configuration is:

input {
  file {
    path => "F:\logstash-2.4.0\logstash-2.4.0\bin\samplexml.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      pattern => "^<\?cd.*\>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    xpath => [
      "/catalog/cd/@country", "country",
      "/catalog/cd/title/text()", "title",
      "/catalog/cd/artist/text()", "artist",
      "/catalog/cd/price/text()", "price"
    ]
    store_xml => false
    target => "doc"
  }
}
output {
  stdout { codec => rubydebug }
}

How can I achieve my desired output from the above XML file?

Thanks

In the output it is merging all the events into one field, but what I want is something like this:

In your example there are multiple fields with the same name. That isn't possible, so I don't know what you want things to look like.

Thanks @magnusbaeck

I updated my question with a small example; I hope this makes my requirement clear.

[quote="magnusbaeck, post:2, topic:90122"]
In your example there are multiple fields with the same name
[/quote]

So, if multiple fields have the same name, I can't divide them based on "country".

Thanks

I updated my question with a small example; I hope this makes my requirement clear.

No, that example has the same problem. I'm guessing you want this (expressing it as JSON):

"some_field": [
  {
    "country": "USA",
    "title": "Empire Burlesque",
    ...
  },
  {
    "country": "UK",
    "title": "Hide your heart",
    ...
  },
  ...
]

In other words an array of objects, each object representing an album.
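
For anyone landing here later, one way this per-album structure might be produced is sketched below. It is an untested outline, not a confirmed fix, and it rests on a few assumptions: that the whole XML document arrives as a single event in `message` (which the merged output above suggests, since the multiline pattern looks for a literal `<?cd` that never occurs in the file), that the xml filter's parsed result places the `<cd>` elements in an array under `[doc][cd]`, and that the installed split filter accepts an array field. The field name `doc` is simply the `target` from the original configuration.

```
filter {
  # Parse the whole document and keep the parsed structure in "doc"
  # instead of pulling values out with xpath.
  xml {
    source => "message"
    store_xml => true
    target => "doc"
  }

  # Assumption: the parsed <cd> elements form an array at [doc][cd].
  # split then emits one event per array element, i.e. one per album.
  split {
    field => "[doc][cd]"
  }
}
```

If those assumptions hold, each resulting event should carry a single album under `[doc][cd]`, with `country` coming from the attribute and `title`, `artist`, and `price` from the child elements, which avoids the duplicate-field-name problem described above.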
