I'm having issues searching data in Kibana. It is not working as expected. The situation is:
The logs being collected are firewall logs.
There are 2 fields: Action and matches.action.
Action specifies the final action taken by the firewall, and matches.action specifies the preliminary action taken. The values can be one or more of (log, allow, drop, challenge, simulate).
Since I don't want 2 fields for the same kind of data, I merged matches.action into action, which makes the action field an array.
When I filter the data, the results also include records that I did not ask for.
E.g. when searching for "drop" traffic, "simulate" traffic is shown as well.
Refer to the image below.
If you have an array of objects under one entity, it is recommended to map the data in a nested way. Could you please show what your mapping looks like for the "action" field?
I'm using the Logstash http_input plugin to ingest data into Elasticsearch from a Python script. I have a Logstash if condition that does the merge operation, as follows:
Your Logstash config looks good, but once array data enters Elasticsearch, it is converted into a flat structure. This means the relationship between the objects in the array is lost. That is why you're getting hits for the match field when you're interested in the match.action field, and vice versa.
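To make the flattening concrete, here is a rough pure-Python imitation of what happens to an array of objects at index time. This is only a sketch of the behaviour, not Elasticsearch code: the `flatten` and `matches_all` helpers and the `rule` field are made up for illustration.

```python
from collections import defaultdict

def flatten(doc, prefix=""):
    """Collapse nested objects and arrays of objects into
    dotted-path -> list-of-values, imitating how an array of
    objects ends up as flat multi-valued fields."""
    flat = defaultdict(list)
    for key, value in doc.items():
        path = f"{prefix}{key}"
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                for sub_path, sub_values in flatten(item, path + ".").items():
                    flat[sub_path].extend(sub_values)
            else:
                flat[path].append(item)
    return dict(flat)

def matches_all(flat_doc, terms):
    """True if every field contains the requested value somewhere."""
    return all(value in flat_doc.get(field, []) for field, value in terms.items())

# Hypothetical event: two match objects, each with its own action and rule.
event = {
    "matches": [
        {"action": "drop", "rule": "r1"},
        {"action": "simulate", "rule": "r2"},
    ]
}

flat_event = flatten(event)
# {'matches.action': ['drop', 'simulate'], 'matches.rule': ['r1', 'r2']}

# No single match object has action=drop AND rule=r2, yet the flat
# document satisfies both conditions, so it comes back as a hit.
false_positive = matches_all(flat_event, {"matches.action": "drop", "matches.rule": "r2"})
print(flat_event)
print(false_positive)
```

The same thing happens with a merged action array: once "drop" and "simulate" sit in one multi-valued field, a search for "drop" cannot tell which of them was the final action.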
Try denormalising your data before you index it into Elasticsearch, rather than keeping it as an array.
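One possible reading of "denormalising" here is to emit one flat document per preliminary action, with the final and preliminary actions kept in separate scalar fields. The sketch below could run in the Python script before the data is posted to Logstash. The field names `final_action` and `preliminary_action`, the `src_ip` field, and the shape of the incoming event are assumptions, since the actual event layout is not shown in this thread.

```python
def denormalise(event):
    """Return one flat document per preliminary action.

    Assumes the raw event carries the final action in "Action" and the
    preliminary action(s) in event["matches"]["action"] (string or list).
    """
    preliminary = event.get("matches", {}).get("action", [])
    if isinstance(preliminary, str):
        preliminary = [preliminary]
    # Copy every other field unchanged into each output document.
    base = {k: v for k, v in event.items() if k not in ("matches", "Action")}
    docs = []
    for prelim in preliminary or [None]:
        docs.append({
            **base,
            "final_action": event.get("Action"),
            "preliminary_action": prelim,
        })
    return docs

# Hypothetical firewall event: preliminary actions drop and log, final action simulate.
raw = {"src_ip": "10.0.0.1", "Action": "simulate", "matches": {"action": ["drop", "log"]}}
docs = denormalise(raw)

# Filtering on the final action no longer picks up this "simulate" event.
drop_final = [d for d in docs if d["final_action"] == "drop"]
print(docs)
print(drop_final)
```

Note that this produces several documents per firewall event, so counts in visualisations would need to take that into account.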
Arrays of objects do not work as you would expect: you cannot query each object independently of the other objects in the array. If you need to be able to do this, then you should use the nested datatype, where you can define a mapping for your use case as follows:
E.g. create a mapping for your index where you map your action field to action.second and your preliminary action (matches.action) to action.first.
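As a sketch of what such a mapping and a matching query could look like, the request bodies are written here as Python dicts so they can be sent from the same script. The sub-field names `first` and `second` follow the suggestion above; the `keyword` type, the typeless mapping layout (Elasticsearch 7 and later, older versions put a mapping type name between "mappings" and "properties") and the helper name `final_action_query` are assumptions.

```python
import json

# Mapping body: "action" is a nested field with two keyword sub-fields.
nested_mapping = {
    "mappings": {
        "properties": {
            "action": {
                "type": "nested",
                "properties": {
                    "first": {"type": "keyword"},   # preliminary action (matches.action)
                    "second": {"type": "keyword"},  # final action (Action)
                },
            }
        }
    }
}

def final_action_query(value):
    """Build a nested query that matches only on the final action."""
    return {
        "query": {
            "nested": {
                "path": "action",
                "query": {"term": {"action.second": value}},
            }
        }
    }

# Shape a document would need to have under this mapping.
example_doc = {"action": [{"first": "drop", "second": "simulate"}]}

print(json.dumps(nested_mapping, indent=2))
print(json.dumps(final_action_query("drop"), indent=2))
```

With this layout, a search for final action "drop" would not match `example_doc`, because "drop" only appears in action.first.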
Note: if your end goal is also to create visualisations/dashboards using this data, then you may want to avoid nested datatypes, as Kibana does not yet support them. In that case, try denormalising your data.
I don't think that will work as expected. When you try to use that field in queries by accessing it as "actions.actiontaken", it wouldn't know which value to consider.