I'm not familiar with Elasticsearch Ingest Node; from browsing the documentation so far, it seems what I need is the Append Processor. However, I don't know how to split a field value by the path separator and then get at the individual parts.
It seems I also need a little bit of logic to exclude other data from this treatment: we have other logs that do not follow the path convention (which is why we use tags like "webserver", "tomcat", etc.), and I'm not sure how to do that with this "pipeline" thing.
A small example would be nice, thank you.
Anyway, if it is too convoluted we might have to bite the bullet and run Logstash (something I want to avoid).
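For reference, a minimal sketch of the kind of Filebeat configuration being described below (assuming the per-prospector pipeline option from Filebeat 5.x-style prospectors; the paths and tags are placeholders, only the analyze_source pipeline name comes from this discussion):

```
filebeat.prospectors:
  # First prospector: these events should go through the ingest pipeline
  - input_type: log
    paths:
      - /var/log/apps/*/*/*.log      # placeholder path
    pipeline: analyze_source         # ingest pipeline defined in Elasticsearch
    tags: ["app"]

  # Second prospector: logs that do not follow the path convention
  - input_type: log
    paths:
      - /var/log/tomcat/*.log        # placeholder path
    tags: ["tomcat"]

output.elasticsearch:
  hosts: ["localhost:9200"]
```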
With this configuration, all events from the first prospector will be sent to the analyze_source pipeline, and events from the second prospector will not be sent to any ingest pipeline.
In Elasticsearch you can try the grok processor. It's basically regular expressions on steroids, supporting 'templates' (reusable named patterns), extracting fields, and converting strings to (e.g. numeric) types. The grok processor even supports multiple patterns, in case you have different schemas. Ingest node also has some failure handling in case content is unparseable by grok, so you can use that as well and always send all events to the pipeline. I recommend testing in the Kibana console using the simulate API. If you really need something custom, you can use the script processor with Painless ([1], [2], [3]).
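A grok-based pipeline could look roughly like the sketch below. It assumes a made-up path convention of /var/log/apps/<team>/<application>/<file> in the Filebeat source field, so adjust the pattern to your real layout; with ignore_failure, events that don't match (e.g. the webserver/tomcat logs) just pass through untouched:

```
PUT _ingest/pipeline/analyze_source
{
  "description": "Extract path elements from the source field",
  "processors": [
    {
      "grok": {
        "field": "source",
        "patterns": [
          "/var/log/apps/%{DATA:team}/%{DATA:application}/%{GREEDYDATA:log_file}"
        ],
        "ignore_failure": true
      }
    }
  ]
}

POST _ingest/pipeline/analyze_source/_simulate
{
  "docs": [
    { "_source": { "source": "/var/log/apps/payments/billing/app.log", "message": "..." } }
  ]
}
```

The second request runs the simulate API against the stored pipeline, so you can check the extracted fields in the Kibana console before pointing Filebeat at it.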
I don't see how the Append processor is what you need. If grok is not doing the trick, you can try the split processor + a script processor to assign the individual fields (on the other hand, I think Painless lets you use some of the Java API, i.e. you can use split).
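A rough sketch of that split + script alternative for the same pipeline, with invented field names (path_parts, team, application) and the same hypothetical /var/log/apps/<team>/<application>/<file> layout; on older 5.x versions the script body goes under "inline" instead of "source":

```
PUT _ingest/pipeline/analyze_source
{
  "description": "Split the source path and assign the parts to fields",
  "processors": [
    { "set":    { "field": "path_parts", "value": "{{source}}" } },
    { "split":  { "field": "path_parts", "separator": "/" } },
    {
      "script": {
        "lang": "painless",
        "source": "if (ctx.path_parts.size() > 5) { ctx.team = ctx.path_parts[4]; ctx.application = ctx.path_parts[5]; }"
      }
    },
    { "remove": { "field": "path_parts" } }
  ]
}
```

The size() guard means events whose source does not have the expected depth simply pass through without the extra fields, which also covers the logs that don't follow the path convention.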