Parsing xml document using xpath

ushadatt · September 15, 2015, 1:51pm

I am trying to parse the following XML data using Logstash. It works for a single document, but when I increase the number of documents, it stops working.

<Book:Body>
    <Book:Head>
        <bookname>Book:Name</bookname>
            <ns:Hello xmlns:ns="www.example.com">
                <ns:BookDetails>
                    <ns:ID>123456</ns:ID>
                    <ns:Name>ABC</ns:Name>
                </ns:BookDetails>
            </ns:Hello>
    </Book:Head>
<Book:Body>

My config file is as given:

multiline {
    pattern => "<Book:Body>"
    what => "previous"
    negate => "true"
}

xml {
    store_xml => "false"
    source => "message"
    remove_namespaces => "true"

    xpath => [
        "/Book/Book/BookDetails/ID/text()", "UUID",
        "/Book/Book/BookDetails/Name/text()", "Name"
    ]
}

mutate {
    add_field => ["IDIndexed", "%{ID}"]
    add_field => ["NameIndexed", "%{Name}"]
}

magnusbaeck · September 15, 2015, 2:09pm

Could you be a bit more specific than "it's not working"? What do you get? Is there anything interesting in the logs? What do you mean by "multiple documents", multiple consecutive Book:Body elements in the same file...?

ushadatt · September 16, 2015, 4:49am

Yes, I mean multiple consecutive Book:Body elements in the same file. With just one entry, as in my example, Logstash parses the two fields ID and Name, and the mutate filter adds the new fields. But with multiple records it fails to parse the message, and the new fields contain the literal strings %{ID} and %{Name} instead of values. Is there something wrong with my multiline pattern or XPath?

magnusbaeck · September 16, 2015, 6:00am

The multiline pattern looks okay. I suggest you simplify things by removing the xml filter and just emitting messages with the joined XML lines. What happens then if you feed Logstash a file with multiple Book:Body elements?
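
A minimal sketch of such a stripped-down pipeline might look like the following. The file path is a placeholder, and the sincedb_path setting is only an assumption to force the file to be re-read on every test run; the multiline settings are copied unchanged from the config above.

```conf
input {
    file {
        # Placeholder path; point this at the file with several Book:Body elements
        path => "/path/to/books.xml"
        start_position => "beginning"
        # Assumption: forget the read position so each test run starts from the top
        sincedb_path => "/dev/null"
    }
}

filter {
    # Same settings as in the original config: every line that does NOT
    # match <Book:Body> is appended to the previous event
    multiline {
        pattern => "<Book:Body>"
        what => "previous"
        negate => "true"
    }
}

output {
    # Print every joined event so the message field can be inspected
    stdout { codec => rubydebug }
}
```

If each printed event holds exactly one complete Book:Body element in its message field, the joining is probably fine and the xml filter is the next thing to look at.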

BTW, your example Book:Body element ends with <Book:Body> rather than </Book:Body>. I assume that was a typo?

ushadatt · September 16, 2015, 6:19am

Actually, I tried the example again without the ns namespace, and Logstash was able to parse the document. But when I use ns in all the tags, as in the Book:Body element above, it does not parse it, even though I set the remove_namespaces option in the xml filter. I guess the problem is caused by the namespaces on the XML tags. This is the example without namespaces that I was working with:

    <Book>
        <bookname>Book:Name</bookname>
        <Hello>
            <BookDetails>
                <ID>123456</ID>
                <Name>ABC</Name>
            </BookDetails>
        </Hello>
    </Book>

Yes, sorry for the typo! The last line should be </Book:Body>.

Navneet_Mathpal · September 16, 2015, 11:23am

+1, I am getting the same issue (remove_namespaces => true is not removing the namespaces).
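
For anyone landing here with the same symptom, one workaround that is sometimes used when namespace prefixes get in the way is to match on local element names in the XPath expressions, so the expressions do not depend on prefixes at all. This is an untested sketch, not a confirmed fix for this thread; the target field names ID and Name are an assumption, chosen to match the %{ID} and %{Name} references in the mutate filter above.

```conf
xml {
    source => "message"
    store_xml => "false"
    # local-name() compares the element name without any namespace prefix,
    # so ns:BookDetails and BookDetails are both matched
    xpath => [
        "//*[local-name()='BookDetails']/*[local-name()='ID']/text()", "ID",
        "//*[local-name()='BookDetails']/*[local-name()='Name']/text()", "Name"
    ]
}
```

Note that these expressions search the whole document instead of following a fixed path from the root, so they would also match BookDetails elements nested elsewhere.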
