I am pretty new to Elasticsearch and I have a problem that I could not solve even with Google, so I would really appreciate any help.
I use a tool (log2timeline) to create an index and add data to it. I am not able to alter this process. Log2timeline stores an XML structure in a field of type text called xml_string - here is the relevant part of the mapping:
I would like to search each XML tag like an Elasticsearch field, because I use Kibana to analyze the data.
Therefore, I would like to add an object field to the index that contains the parsed XML. I have figured out that I can alter the index by using _update_by_query, but I don't know how to convert the string to JSON automatically.
I found the Logstash xml filter plugin, but to my understanding it can't be used in Elasticsearch queries.
Is there any way to perform this task directly in Elasticsearch?
For clarification, I have added an example of the xml_string content:
Thank you for the reply. I have looked into both suggestions.
elasticsearch-ingest-xml
As far as I understand, I would add this plugin to an Elasticsearch ingest pipeline.
Unfortunately, the Gradle build fails (maybe the plugin is too old):
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] FAILURE: Build failed with an exception.
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * Where:
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] Build file '/root/elasticsearch-ingest-xml/build.gradle' line: 17
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * What went wrong:
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] A problem occurred evaluating root project 'ingest-xml'.
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] > Failed to apply plugin [id 'carrotsearch.randomized-testing']
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] > Removing tasks from the task container is not supported. Disable the tasks or use replace() instead.
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * Try:
2020-06-27T07:28:42.367+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] Run with --stacktrace option to get the stack trace. Run with --scan to get full insights.
2020-06-27T07:28:42.368+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]
2020-06-27T07:28:42.368+0000 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * Get more help at https://help.gradle.org
2020-06-27T07:28:42.368+0000 [ERROR] [org.gradle.internal.buildevents.BuildResultLogger]
2020-06-27T07:28:42.368+0000 [ERROR] [org.gradle.internal.buildevents.BuildResultLogger] BUILD FAILED in 2s
FSCrawler also supports xml parsing
Could you elaborate on how to activate this for the field xml_string?
Yes, I tried to build it using Gradle and the build instructions. Are there any repositories with prebuilt versions? I am running version 7.8.0.
Okay, then this will not work, because this is just an XML string.
Just a general idea: I have taken a closer look at the program that does the upload to Elasticsearch. If I change it so that the field xml_string is populated with a JSON structure instead of an XML structure, does Elasticsearch detect this automatically, or do I need to adjust the mapping as well?
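To make the idea concrete, here is a minimal sketch of what the conversion could look like on the uploader side, using only the Python standard library. This is not log2timeline's actual code: the function names, the "@" and "#text" key conventions, the sample XML, and the separate field name xml_parsed are all assumptions made for illustration. Writing the object to a new field rather than into xml_string sidesteps the existing text mapping, since a field already mapped as text will not accept an object value.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Recursively turn an ElementTree element into dicts, lists and strings."""
    children = list(element)
    if not children and not element.attrib:
        # Leaf without attributes: return its text only.
        return (element.text or "").strip()

    result = {}
    # Attributes get an "@" prefix (a convention chosen for this sketch).
    for name, value in element.attrib.items():
        result["@" + name] = value
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            # A repeated tag is collected into a list.
            existing = result[child.tag]
            if not isinstance(existing, list):
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    text = (element.text or "").strip()
    if text:
        # Text next to attributes or children is kept under "#text".
        result["#text"] = text
    return result


def parse_xml_string(xml_string):
    """Parse an XML string and wrap the result under the root tag name."""
    root = ET.fromstring(xml_string)
    return {root.tag: xml_to_dict(root)}


# Made-up sample, NOT the real xml_string content from the index.
sample = (
    "<Event><System><EventID>4624</EventID><Channel>Security</Channel></System>"
    "<Data Name='User'>alice</Data><Data Name='Host'>pc1</Data></Event>"
)

# Hypothetical document: keep the raw string and add the parsed object
# under a new field name (xml_parsed is an assumption, not an existing field).
doc = {"xml_string": sample, "xml_parsed": parse_xml_string(sample)}
print(json.dumps(doc, indent=2))
```

Two caveats with this approach: namespaced XML will produce tag names of the form `{namespace}Tag` with ElementTree, and a tag that is sometimes a plain string and sometimes an object (for example with and without attributes) would likely cause mapping conflicts, so the output may need normalizing before indexing.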