Hi Team,
I'm new to Logstash. So far I have configured basic input, Filebeat, and output plugins by following the product documentation on the official website.
Now I have a real-world scenario:
I have an unstructured log file (abc.log) stored in a local folder. It is a mix of junk text with XML responses in between. I need to extract only the XML responses from the log file through a Logstash pipeline.
Please guide me to achieve this.
Appreciate the quick response!
If there is valid XML in the middle of a field, an xml filter with store_xml set to true (the default) will find it and parse it. (xpath, on the other hand, will reject it.)
input { generator { count => 1 lines => [ '2020/01/02 08:40:16 here is some XML: <a><b>1</b><c>2</c></a> and more stuff' ] } }
filter { xml { source => "message" target => "theXML" } }
output { stdout { codec => rubydebug { metadata => false } } }
I have a small difficulty: the XML response is not being extracted from my log file. Below is a sample of the log file.
2020/01/02 08:40:16 here is some XML:
<a>
<b>1</b>
<c>2</c>
</a>
and more stuff
2020/01/02 08:40:16 here is some XML:
2020/01/02 08:40:16 here is some XML:
2020/01/02 08:40:16 here is some XML:
2020/01/02 08:40:16 here is some XML:
<a>
<b>1</b>
<c>2</c>
</a>
The log file above contains junk in between, and each XML response is spread over separate lines.
When I combine the XML into one line, like (<a><b>1</b><c>2</c></a>), the output comes out as you described. Otherwise Logstash treats every single line as a separate event in the output.
Combining all the XML tags into a single line manually is not practical, since the log file is huge and contains a large number of XML responses.
Please guide me on how to extract the XML from the above log file.
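One way to approach this is to join the continuation lines into a single event before the xml filter runs, using the multiline codec on a file input. The following is only a sketch under some assumptions: the path is a placeholder, every new log entry is assumed to start with a date like 2020/01/02, and the xml filter is assumed to pick the XML out of the joined message as described earlier in the thread.

```
input {
  file {
    path => "/path/to/abc.log"        # placeholder, point this at the real file
    start_position => "beginning"
    sincedb_path => "/dev/null"       # for testing only: re-read the file on every run
    codec => multiline {
      # any line that does NOT start with a date is appended to the previous line
      pattern => "^[0-9]{4}/[0-9]{2}/[0-9]{2} "
      negate => true
      what => "previous"
      auto_flush_interval => 2        # flush the last pending event after 2 seconds
    }
  }
}
filter {
  xml { source => "message" target => "theXML" }
  # entries with no XML at all fail to parse; drop them if they are not needed
  if "_xmlparsefailure" in [tags] { drop {} }
}
output { stdout { codec => rubydebug { metadata => false } } }
```

With the sample above, the timestamp-only lines become their own events and are dropped, while each timestamp line followed by an <a>...</a> block becomes one event containing the whole block.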
Thanks Badger, it worked well.
A small clarification: why is the <a> tag not showing in the 'theXML' object in the output? If we want to show it as well, how can we bring it into the 'theXML' object?
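As far as I know, the xml filter discards the root element when it stores the parsed document, so the children of <a> land directly under theXML. I am not aware of a filter option that keeps the root, so one possible workaround is to name it yourself in the target. This is a sketch that assumes the root element is always <a>:

```
filter {
  # nest the parsed children under [theXML][a] so the root name is visible
  xml { source => "message" target => "[theXML][a]" }
}
```

This only works when the root element name is known in advance, since the name is hard-coded in the target.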
Understood. But my question is that I have two different XML objects in the log file.
One is <Request> and the other is <Response>, so I need to differentiate between the two.
Please guide me if there is any other way to achieve this.
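One possible approach is to test the message with a conditional and parse each kind into its own target. This is a sketch that assumes each event contains only one of the two documents and that the root elements are literally named Request and Response with no namespace prefix. The field names theRequest and theResponse are made up for illustration.

```
filter {
  if [message] =~ /<Request\b/ {
    xml { source => "message" target => "theRequest" }
  } else if [message] =~ /<Response\b/ {
    xml { source => "message" target => "theResponse" }
  }
}
```

Downstream you can then check which of the two fields exists on an event to tell requests from responses.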