I have millions of lines of logs that I want to index in ES, using Filebeat + Logstash.
Each log line contains an XML document representing a SOAP message that was received by a web service and logged, like in this example:

```
2019-11-13 10:00:1234 <?xml><Envelope><Body><GetUnitByPosition><Position>1234</Position></GetUnitByPosition></Body></Envelope>
2019-11-12 09:30:5678 <?xml><Envelope><Body><GetPositionByName><Name>Position1</Name></GetPositionByName></Body></Envelope>
2019-11-11 08:30:5678 <?xml><Envelope><Body><UpdatePosition><Position>1234567</Position><Name>Position2</Name><Tag>9876</Tag></UpdatePosition></Body></Envelope>
```
In the real case, this web service receives around 100 different XML message types. To make the logs easier to read, I store the parsed XML using the store_xml => true option of the Logstash xml filter, and I also use XPath expressions to extract every single possible element from the XMLs into separate fields (yes, a heck of a tedious job).
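The relevant part of my filter config looks roughly like this (the timestamp-splitting step is omitted, and field names like xml_payload, parsed_xml, and unit_position are just examples):

```
filter {
  # Assume an earlier grok/dissect filter has already split the leading
  # timestamp off into its own field, leaving the raw XML in "xml_payload".
  xml {
    source    => "xml_payload"
    target    => "parsed_xml"   # required when store_xml => true
    store_xml => true
    # One XPath expression per element -- repeated for every element of
    # every one of the ~100 message types. Destination names are mine.
    xpath => {
      "/Envelope/Body/GetUnitByPosition/Position/text()" => "unit_position"
      "/Envelope/Body/GetPositionByName/Name/text()"     => "position_name"
      "/Envelope/Body/UpdatePosition/Tag/text()"         => "update_tag"
    }
  }
}
```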
Creating an XPath expression for each element is a lot of work, but worse, any change to any of the XML schemas (or any new element) means I also have to change the Logstash config. I would like to avoid that.
So I thought of removing the XPaths and working directly with the fields the xml filter plugin generates. However, that brings another problem: since there are so many distinct elements across all the XMLs, I get many error messages saying the limit of 1000 fields has been reached. That's because the filter generates a nested field for every unique element path, so even the three example messages above produce fields like GetUnitByPosition.Position, GetPositionByName.Name, and UpdatePosition.Tag, and across ~100 message types this quickly blows past the limit.
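The no-XPath variant I tried is essentially just this (same hypothetical field names as above):

```
filter {
  xml {
    source      => "xml_payload"
    target      => "parsed_xml"
    store_xml   => true
    force_array => false   # otherwise every element is also wrapped in an array
    # No xpath list: every element of every message type ends up as its own
    # nested field under "parsed_xml", one field per unique element path.
  }
}
```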
So, the question is: does anyone have experience with a situation like this one? Any recommendations? Maybe I am missing something, such as a configuration setting for these limits.
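For reference, I believe the error comes from the index.mapping.total_fields.limit index setting, which defaults to 1000. I know it can be raised per index (the index name below is just an example), but my understanding is that an oversized mapping hurts performance, so simply raising the limit doesn't feel like the real fix:

```
PUT /my-soap-logs/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```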