XML parsing in Logstash using the xml and split plugins

I have an XML document that looks like this:

<Countries><Country><Name>India</Name><ISDCode>+91</ISDCode><Continent>Asia</Continent><geolocationinfo><lattitude>123129</lattitude><longititude>7890890</longititude></geolocationinfo></Country><Country><Name>Srilanka</Name><ISDCode>+94</ISDCode><Continent>Asia</Continent><geolocationinfo><lattitude>1212349</lattitude><longititude>123890</longititude></geolocationinfo></Country></Countries>

The entire XML document is on a single line. We use Filebeat to collect it from a log file and send it to Logstash.

We need to store this XML as two documents in Elasticsearch:

doc1:
Name: India
ISDCode: +91
continent: Asia
geolocationinfo.Lattitude: 123129
geolocationinfo.longititude: 7890890

doc2:
Name: Srilanka
ISDCode: +94
continent: Asia
geolocationinfo.Lattitude: 1212349
geolocationinfo.longititude: 123890

This is the logstash.conf that we have:

input {
  beats {
    port => 5044
  }
}
filter {
  xml {
    source => "message"
    target => "xml_content"
    store_xml => true
    xpath => [
      "/Countries/Country/Name/text()", "Name",
      "/Countries/Country/ISDCode/text()", "ISDCode",
      "/Countries/Country/Continent/text()", "Continent",
      "/Countries/Country/geolocationinfo/lattitude/text()", "geolocationinfo.lattitude",
      "/Countries/Country/geolocationinfo/longititude/text()", "geolocationinfo.longititude"
    ]
  }
  split {
    field => "Name"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "test"
  }
}

We get two documents, but each one contains only the Name field; the other fields are missing. I also tried adding add_field to the split plugin, but no luck.

The output from the xml filter will be a single event that has several array fields, each with two entries. I cannot think of a clean way of creating one event from the first entry in each of the five array fields, and another event from the second entry in each of those fields, other than using ruby. It would be simpler to do

    xml {
        source => "message"
        target => "[@metadata][xml]"
        store_xml => true
        remove_field => [ "message" ]
    }
    split { field => "[@metadata][xml][Country]" }
    mutate {
        add_field => {
            "Name" => "%{[@metadata][xml][Country][Name][0]}"
            "ISDCode" => "%{[@metadata][xml][Country][ISDCode][0]}"
            "continent" => "%{[@metadata][xml][Country][Continent][0]}"
            "geolocation.latitude" => "%{[@metadata][xml][Country][geolocationinfo][0][lattitude][0]}"
            "geolocation.longitude" => "%{[@metadata][xml][Country][geolocationinfo][0][longititude][0]}"
        }
    }

If you split on the Country array, all five fields for a given country stay together in the same event, and you can then use add_field to copy them into the right place.
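
To see why splitting on Country works, here is a rough Python model of the three steps (parse, split, add_field). It is only a sketch and not Logstash itself: the nested structure in `event` is an assumption inferred from the `[0]` indexes in the config above (every element wrapped in an array), and the helpers `split_on` and `flatten` are hypothetical names made up for this illustration.

```python
import copy

# Assumed shape of what the xml filter stores under its target for the
# sample document: each element is wrapped in an array, and the two
# <Country> elements become a two-entry array.
event = {
    "xml": {
        "Country": [
            {
                "Name": ["India"],
                "ISDCode": ["+91"],
                "Continent": ["Asia"],
                "geolocationinfo": [
                    {"lattitude": ["123129"], "longititude": ["7890890"]}
                ],
            },
            {
                "Name": ["Srilanka"],
                "ISDCode": ["+94"],
                "Continent": ["Asia"],
                "geolocationinfo": [
                    {"lattitude": ["1212349"], "longititude": ["123890"]}
                ],
            },
        ]
    }
}


def split_on(event, parent, field):
    """Model of the split step: one copy of the event per array element,
    with the array replaced by that single element."""
    clones = []
    for element in event[parent][field]:
        clone = copy.deepcopy(event)
        clone[parent][field] = element
        clones.append(clone)
    return clones


def flatten(event):
    """Model of the mutate/add_field step: copy the first entry of each
    array into a top-level field."""
    country = event["xml"]["Country"]
    geo = country["geolocationinfo"][0]
    return {
        "Name": country["Name"][0],
        "ISDCode": country["ISDCode"][0],
        "continent": country["Continent"][0],
        "geolocation.latitude": geo["lattitude"][0],
        "geolocation.longitude": geo["longititude"][0],
    }


docs = [flatten(e) for e in split_on(event, "xml", "Country")]
for doc in docs:
    print(doc)
```

Because each Country entry already carries its own Name, ISDCode, Continent and geolocationinfo, the split never has to pair up entries from separate arrays, which is the part that would otherwise need ruby.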
