Hi everyone, we need some advice.
We have an index fed from a Kafka data source. Our goal is to enrich these events in real time with reference data that lives in Elasticsearch and is updated once a day.
To do this, we built a Logstash pipeline that looks up the reference data with the elasticsearch filter, and we put the memcached filter in front of it as a cache, to speed things up and avoid unnecessary queries to Elasticsearch.
Below is an example of the structure used for one of the enrichments:
# enrichment from the REGION registry data:
# we use the COD_REGN field as the correlation key
# and retrieve the DESC_REGN field
memcached {
  id    => "memcached_get_DESC_REGN"
  hosts => ["ip:port"]
  get   => { "%{COD_REGN}_DESC_REGN" => "DESC_REGN" }
}
if ![DESC_REGN] {
  elasticsearch {
    hosts          => ["https://hostname:9200"]
    index          => "index_name"
    query          => 'COD_REGN: "%{[COD_REGN]}"'
    result_size    => 1
    fields         => { "DESC_REGN" => "DESC_REGN" }
    user           => "user"
    password       => "PWD"
    ca_file        => "/path/to/file.pem"
    tag_on_failure => ["failed_anagrafica_region"]
  }
  memcached {
    hosts => ["ip:port"]
    set   => { "DESC_REGN" => "%{COD_REGN}_DESC_REGN" }
  }
}
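One thing we noticed while reviewing this: if the elasticsearch lookup fails, the memcached set still runs even though DESC_REGN is missing. Since the registry data only changes once a day, a guarded set with an expiry might help (a sketch, assuming the memcached filter's ttl option; 86400 seconds is a placeholder for 24 hours):

# cache only successful lookups, and let entries expire after a day
if [DESC_REGN] {
  memcached {
    hosts => ["ip:port"]
    ttl   => 86400
    set   => { "DESC_REGN" => "%{COD_REGN}_DESC_REGN" }
  }
}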
Is there anything else we can do to optimize the pipeline? We currently have a delay of about 3 hours, due to the large number of queries to Elasticsearch and the volume of data.
The Elastic Stack version we are using is 7.2.
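In case it is relevant, these are the kinds of pipeline settings we would also look at tuning (a sketch with placeholder values, not our production configuration, assuming the standard pipelines.yml options):

# pipelines.yml -- placeholder values, for illustration only
- pipeline.id: "enrichment"
  path.config: "/etc/logstash/conf.d/enrichment.conf"
  pipeline.workers: 8
  pipeline.batch.size: 500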
Thank you!!
Rosanna