Logstash pipeline output | duplicate messages ending up in both indexes

In the /etc/logstash/conf.d/ directory I have configured two pipeline files (1.conf and 2.conf) that read messages from Kafka topics and send them to the data nodes in the cluster.

[root@ingest1 conf.d]# egrep -w "topics|index" * | uniq
1.conf:        topics => ["events"]
1.conf:    index => "events-%{+YYYY.MM.dd}"
2.conf:        topics => ["input"]
2.conf:    index => "input-%{+YYYY.MM.dd}"

But if I produce a message to the "events" topic, the message ends up in both indexes. The same happens with the other topic.
Am I missing anything?

All files in the config directory are concatenated into a single logical pipeline, meaning that data from all inputs goes through all filters and is sent to all outputs unless you use conditionals to control the flow.

@Christian_Dahlqvist thanks for the quick response.
Could you also guide me on how to solve this?
For example, I want 1.conf to read the "events" topic, apply its own filters, and output only to its own index.

TIA!

If the two configurations are completely separate from input to output I would strongly suggest using multiple pipelines. If there is overlap, or you are stuck on an old version then you can use something like

add_field => { inputTopic => "events" }

(with two different values for inputTopic) on the inputs to distinguish them, then use

output {
    if [inputTopic] == "events" {
        elasticsearch {
             ...
        }
    }
}

to send them to different end-points.
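
As a rough sketch of where that option would sit, assuming both kafka inputs end up in the same concatenated pipeline (the bootstrap_servers value is a placeholder and is not taken from the original configs):

input {
    kafka {
        bootstrap_servers => "localhost:9092"    # placeholder broker address
        topics => ["events"]
        # tag every event from this input so the output can branch on it
        add_field => { "inputTopic" => "events" }
    }
    kafka {
        bootstrap_servers => "localhost:9092"    # placeholder broker address
        topics => ["input"]
        add_field => { "inputTopic" => "input" }
    }
}

Note that inputTopic is an ordinary field, so it would be indexed along with the event. If that is unwanted, it should be possible to use a name under [@metadata] instead (for example "[@metadata][inputTopic]"), since metadata fields are not sent to the output.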

Even better, since you are using a kafka input, you can have the input decorate the metadata with the topic name and then make the output configuration conditional upon that.
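
A minimal sketch of that idea, assuming a single kafka input subscribed to both topics and placeholder broker and Elasticsearch addresses. With decorate_events enabled the kafka input adds the topic name as [@metadata][kafka][topic]. Older plugin versions take a boolean here, while newer ones also accept values such as "basic", so check the documentation for the installed version.

input {
    kafka {
        bootstrap_servers => "localhost:9092"    # placeholder broker address
        topics => ["events", "input"]
        decorate_events => true                  # adds [@metadata][kafka][topic] and related fields
    }
}

output {
    if [@metadata][kafka][topic] == "events" {
        elasticsearch {
            hosts => ["http://localhost:9200"]   # placeholder cluster address
            index => "events-%{+YYYY.MM.dd}"
        }
    } else if [@metadata][kafka][topic] == "input" {
        elasticsearch {
            hosts => ["http://localhost:9200"]   # placeholder cluster address
            index => "input-%{+YYYY.MM.dd}"
        }
    }
}

Any topic-specific filters would need the same kind of conditional. Because the index names here match the topic names, a single elasticsearch output with index => "%{[@metadata][kafka][topic]}-%{+YYYY.MM.dd}" may be enough instead of the two branches.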


@Badger thanks for this info!
I ended up sending all the topics' data to one index, per the devs' request for easier searching.
The notes about multiple pipelines were also easy to understand and implement.
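
For anyone who wants to keep the two configurations fully separate, a minimal sketch of a multiple-pipelines setup could look like the following. It assumes a package install where the file lives at /etc/logstash/pipelines.yml, and the pipeline ids are made-up names:

# /etc/logstash/pipelines.yml (path assumed for a package install)
- pipeline.id: events        # hypothetical id, any unique name works
  path.config: "/etc/logstash/conf.d/1.conf"
- pipeline.id: input         # hypothetical id
  path.config: "/etc/logstash/conf.d/2.conf"

Each pipeline then has its own inputs, filters and outputs, so events read by 1.conf never reach the output in 2.conf. Logstash only reads pipelines.yml when it is started without the -f or -e options.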
