Logstash enriching data with lookup

Hi guys,

I am currently struggling to understand how this can be done properly within the Elastic Stack. I use NetFlow data as an input source. Specifically, I use ElastiFlow (https://github.com/robcowart/elastiflow), so I need to use Logstash. However, as I understand it, it might also be possible to add an ingest pipeline on the Elasticsearch side to handle the kind of task I want to perform.

So, my router and other devices send NetFlow data to Logstash, which then sends it to Elasticsearch, and I see fancy stuff in Kibana.

On the other hand, I have a fully managed network, so I should know every device. I have a CMDB, which I can export as a CSV file or query through an API.

What I would like to do is the following:

  • Netflow is received
  • The MAC address, hostname, or IP is compared to the CMDB content.
    • if one of those matches ==> add field "known: true"
    • if none of those match ==> add field "known: false"

I do know how to manipulate event data, enrich it, extract messages, and so on. What I cannot figure out is how to load information from a file or an API inside a Logstash filter. I know that this might increase CPU or RAM usage and that the bucket may take longer to fill, depending on the lookups that have to be done. But how am I even supposed to do this cross-reference?

Thanks in advance for any idea.

You could use a translate filter, which looks up a field's value in a dictionary. If you do not like that, you could consider an elasticsearch, jdbc_streaming, http, or memcached filter.
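
A minimal sketch of the translate approach, assuming the CMDB export has been split into two-column CSV dictionaries (key,value), which is the CSV layout the translate filter expects. The field names `[flow][src_addr]`, `[flow][src_mac]`, and `[flow][src_hostname]` and the file paths are placeholders, not taken from your setup, so check your events in Kibana for the real names. Older versions of the plugin call the `source` and `target` options `field` and `destination`.

```
filter {
  # One lookup per identifier. Each dictionary is a two-column CSV, e.g.
  #   10.0.0.5,asset-0042
  # The value can be anything non-empty, such as the CMDB asset ID.
  translate {
    source           => "[flow][src_addr]"              # assumed field name
    target           => "[@metadata][cmdb_ip]"
    dictionary_path  => "/etc/logstash/cmdb/ip.csv"     # hypothetical path
    refresh_interval => 300                             # check the file for changes every 300 s
  }
  translate {
    source           => "[flow][src_mac]"               # assumed field name
    target           => "[@metadata][cmdb_mac]"
    dictionary_path  => "/etc/logstash/cmdb/mac.csv"    # hypothetical path
    refresh_interval => 300
  }
  translate {
    source           => "[flow][src_hostname]"          # assumed field name
    target           => "[@metadata][cmdb_host]"
    dictionary_path  => "/etc/logstash/cmdb/host.csv"   # hypothetical path
    refresh_interval => 300
  }

  # No fallback is set, so a target only exists when its lookup matched.
  if [@metadata][cmdb_ip] or [@metadata][cmdb_mac] or [@metadata][cmdb_host] {
    mutate { add_field => { "known" => "true" } }
  } else {
    mutate { add_field => { "known" => "false" } }
  }
}
```

Fields under `[@metadata]` are not sent to the outputs, so only `known` ends up in Elasticsearch. Note that `add_field` writes the string "true" or "false"; if you want a real boolean, a `mutate { convert => { "known" => "boolean" } }` afterwards should do it.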

Thanks! I will look into the translate filter. I just saw that it can load a CSV file from the filesystem. I guess I will need three translations to properly match my idea :)!