Merge data feeds to one searchable index


#1

I have ELK parsing our firewall logs. Everything is working, and we are saving a ton of $$$ as we don't have to pay Splunk to index the 6-8 GB a day.

Now I want more: I'd like to add the blueliv threat feed to "enrich" my log data, so I can create dashboards that show any connections from my network where an IP address in my logs matches a malicious IP in the blueliv feed.

I know there is a syncing issue I have to watch out for, as the blueliv feed is updated hourly while my firewall log data is streamed continuously. To get around this I will set the viz to refresh hourly.

How would I merge these essentially two separate datasets/indexes? Is that even possible in ES, or is this something I have to address in Logstash?
/ssi


(Zachary Tong) #2

There are a couple ways you could tackle this, but I would probably prefer to keep the data in separate indices. E.g. your logs go to logs-* while blueliv goes to blueliv-* (both on a time granularity that makes sense, probably daily).

Then in Kibana you can just add blueliv-* to your index patterns. Once Kibana knows about the index, you can add it to dashboards just like your log data, create graphs with both, etc.

That should get you a fair amount of flexibility, depending on what you need to do. What won't work well in this approach is if you need to "enrich" your log data with data from the blueliv dataset. E.g. if you want to "tag" your logs with malicious/not_malicious based on the data in the blueliv data...for that, you'd need some kind of pre-processor that executes queries/aggs against blueliv and enriches the doc before indexing. So it'd be a little more complicated pipeline, depending on your needs.
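As a rough illustration of that pre-processing step, here is a minimal Python sketch. It assumes the malicious IPs have already been pulled out of the blueliv data into an in-memory set (in a real pipeline that set would be refreshed on the feed's hourly schedule, e.g. by querying the blueliv index). The field names `src_ip`, `dst_ip` and `threat`, the helper `tag_log`, and the sample addresses are all hypothetical and not taken from the blueliv plugin or any Elasticsearch API.

```python
from typing import Iterable

def tag_log(doc: dict, malicious_ips: set,
            ip_fields: Iterable[str] = ("src_ip", "dst_ip")) -> dict:
    """Return a copy of a log document with a 'threat' field added.

    The field names are assumptions; adjust them to your own log mapping.
    """
    enriched = dict(doc)
    # Collect every IP field whose value appears in the threat feed snapshot.
    hits = [field for field in ip_fields if doc.get(field) in malicious_ips]
    enriched["threat"] = {"malicious": bool(hits), "matched_fields": hits}
    return enriched

# Hypothetical snapshot of the feed (documentation-range addresses).
malicious_ips = {"203.0.113.7", "198.51.100.23"}

# Hypothetical firewall log documents, as they might look before indexing.
logs = [
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.7", "action": "allow"},
    {"src_ip": "10.0.0.6", "dst_ip": "192.0.2.10", "action": "allow"},
]

enriched_logs = [tag_log(doc, malicious_ips) for doc in logs]
for doc in enriched_logs:
    print(doc["dst_ip"], doc["threat"]["malicious"])
```

Each enriched document would then be indexed in place of the raw one, so a dashboard can filter on the `threat.malicious` flag directly. Note that the tag reflects the feed as it was at indexing time, not later updates.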


#3

Hi, thanks for the quick reply.

The more I think about it, the more I think I might need to keep it all in one index. Essentially I want the same functionality I already have with the geoip data set, which lets me match an IP address to a set of coordinates. The only difference here is that I want to see, and possibly match, a specific IP in the log data against the feed from blueliv. It might be the blueliv template itself, or even the plug-in, that I need to look at. I'm currently talking with blueliv, who developed the plugin, but they are by their own admission not hard-core Elasticsearch specialists, and neither am I.

Would it be better/easier if the blueliv data was in a .dat file like the geoip data set, which was then updated once every X hours by a cron job, as with the geoip data?

Also, would I need to make edits to any sort of Elasticsearch/Logstash template in Kibana?

