I would like to send some events to more than one ES cluster. I have the following architecture:
LOG SOURCE -> LOGSTASH -> KAFKA -> LOGSTASH -> ES
The first Logstash processes the logs and sends them into Kafka. The second Logstash just ships them to Elasticsearch. To get some events into more than one ES cluster, I have two options:
Option 1: Store each event from the first LS into one of two Kafka topics (data_all or data_secondES). The second Logstash will have 2 pipelines: the first pipeline reads both topics and stores the data in the first ES, and the second pipeline (with a different Kafka consumer group) reads only data_secondES and stores those events in the second ES (a sketch of both pipelines follows below).
BUT this is not good, because it effectively prioritizes one of the topics at random. I do not know the number of events per topic in advance, so I would give each topic, say, 8 partitions. If at some point 99 percent of the events are going to data_all, the second Logstash will drain data_secondES first while data_all still has messages left to process. This is because the partitions of both topics are assigned to the consumers equally, so the topic with fewer messages is finished first.
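For illustration, a minimal sketch of what option 1 could look like on the second Logstash, assuming hypothetical host names (kafka:9092, es1:9200, es2:9200), pipeline ids, and config paths:

    # pipelines.yml on the second Logstash
    - pipeline.id: to_first_es
      path.config: "/etc/logstash/conf.d/to_first_es.conf"
    - pipeline.id: to_second_es
      path.config: "/etc/logstash/conf.d/to_second_es.conf"

    # to_first_es.conf: reads BOTH topics, writes to the first cluster
    input {
      kafka {
        bootstrap_servers => "kafka:9092"
        topics            => ["data_all", "data_secondES"]
        group_id          => "ls_first_es"
      }
    }
    output {
      elasticsearch { hosts => ["http://es1:9200"] }
    }

    # to_second_es.conf: different consumer group, so its offsets on
    # data_secondES are tracked independently of the first pipeline
    input {
      kafka {
        bootstrap_servers => "kafka:9092"
        topics            => ["data_secondES"]
        group_id          => "ls_second_es"
      }
    }
    output {
      elasticsearch { hosts => ["http://es2:9200"] }
    }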
Option 2: Store events from the first LS into BOTH Kafka topics: every event goes to data_all, and events meant for the second cluster are additionally written to data_secondES. Each pipeline in the second Logstash then reads only its own topic, so there is no prioritization problem (see the sketch below).
BUT I will have to store the events twice.
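For comparison, a sketch of option 2's output section on the first Logstash, assuming a hypothetical [to_second_es] flag that a filter has already set on the events that should also reach the second cluster:

    output {
      # every event is stored in data_all
      kafka {
        bootstrap_servers => "kafka:9092"
        topic_id          => "data_all"
        codec             => json
      }
      # flagged events are duplicated into data_secondES
      if [to_second_es] {
        kafka {
          bootstrap_servers => "kafka:9092"
          topic_id          => "data_secondES"
          codec             => json
        }
      }
    }

With this layout the pipeline feeding the first ES only ever subscribes to data_all, so its throughput cannot be affected by the second topic; the price is the duplicated storage in Kafka.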
Which option is better, or is there something better that I am not seeing?