So far I've only had experience indexing CSV files with Logstash into Elasticsearch. Now I have a couple of XML files that I want to index, and fortunately it was not that difficult. With the most basic conf I've managed to get the data into Elasticsearch:
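For reference, a minimal setup along these lines might look as follows (this is a sketch, not the exact conf from the post; the file path and the multiline pattern are illustrative assumptions):

```conf
input {
  file {
    # illustrative path, adjust to your environment
    path => "/path/to/analysis.xml"
    start_position => "beginning"
    # join all lines of the file into a single event;
    # the pattern used here is an assumption
    codec => multiline {
      pattern => "<VINAnalysis>"
      negate => true
      what => "previous"
    }
  }
}

filter {
  xml {
    source => "message"
    target => "doc"
  }
}

output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```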
The tag VINAnalysis encloses the whole XML file, so I've used that one as my source. When I look at the data in Kibana I see that Logstash has indexed all the XML tags. I want to get rid of those because I don't want them to be searchable.
I thought I could remove those tags with the remove_tag option. One of the XML tags is <SystemInfo>data: data</SystemInfo>. I've added it to my conf, which you can see above, but the tag is still being indexed. What am I doing wrong?
Perhaps it would be a better idea to use the xpath option to selectively extract the things you do want to keep, instead of extracting everything and ripping out the boring stuff?
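Something along these lines, for example (the xpath expressions and destination field names here are assumptions; adjust them to the actual document structure):

```conf
filter {
  xml {
    source => "message"
    # don't store the full parsed tree, only the selected values
    store_xml => false
    # each pair is "xpath expression", "destination field";
    # these paths are illustrative, not taken from the original conf
    xpath => [
      "/VINAnalysis/VIN/text()", "vin",
      "/VINAnalysis/Model/text()", "model"
    ]
  }
}
```

Note that the xpath option stores each result as an array, so you may want a mutate filter afterwards to flatten fields that only ever have a single value.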
remove_tag => [ "%{SystemInfo}"]
There are several reasons why this doesn't work.
"Tags" in Logstash have nothing to do with XML tags in parsed XML documents.
The %{foo} notation is for expanding the contents of a field into a string; in this case you want to reference a field by name, without that wrapper.
The SystemInfo field is a nested field, so you need to access it via e.g. [VINAnalysis][SystemInfo], or whatever the structure looks like.
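So if the goal is simply to drop that field after parsing, the tool is mutate with remove_field and a nested field reference, roughly like this (the field path is an assumption; check an event in Kibana for the real structure):

```conf
filter {
  mutate {
    # remove the parsed SystemInfo data from the event;
    # [VINAnalysis][SystemInfo] is an assumed path
    remove_field => [ "[VINAnalysis][SystemInfo]" ]
  }
}
```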