Logstash instance scaling

Ondro_Tadanai · October 11, 2017, 8:04am

Hi,
I am using Logstash with the logstash-input-twitter plugin to collect tweets in real time.
How do I configure Logstash for scaling? If I want to run two instances, for example, how do I configure them to cooperate effectively when both use the twitter input plugin?
Thanks a lot
Ondrej

Ondro_Tadanai · October 14, 2017, 5:55pm

Any help please?

magnusbaeck · October 15, 2017, 7:46pm

Logstash has no built-in support for getting two or more instances to talk to each other. If a single Logstash instance can't process the events fast enough, I suggest you use one instance to just read the tweets and do nothing else except pass the events on to any number of other instances, preferably via an intermediate broker with a common queue that all the processing instances read from.
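
To illustrate the shared-queue idea, here is a minimal sketch in Python rather than Logstash configuration. It assumes an in-memory `queue.Queue` as a stand-in for the broker and threads as stand-ins for the processing instances; the names `broker`, `worker`, and `STOP` are made up for the illustration. The point is only that each event placed on a common queue is picked up by exactly one consumer.

```python
import queue
import threading

broker = queue.Queue()   # stand-in for the shared broker queue (assumption)
processed = []           # (worker_name, event) pairs, filled by the workers
lock = threading.Lock()  # protects `processed` across threads
STOP = object()          # sentinel telling a worker to exit


def worker(name):
    """Stand-in for one processing instance reading from the common queue."""
    while True:
        event = broker.get()
        if event is STOP:
            break
        with lock:
            processed.append((name, event))


workers = [threading.Thread(target=worker, args=(f"worker-{i}",)) for i in range(3)]
for t in workers:
    t.start()

# The single "reader" instance only pushes events onto the queue.
for n in range(10):
    broker.put(f"tweet-{n}")

# One sentinel per worker so that every worker shuts down.
for _ in workers:
    broker.put(STOP)
for t in workers:
    t.join()

# Every event was handled exactly once, by whichever worker was free.
print(sorted(event for _, event in processed))
```

Which worker handles which event varies from run to run; what stays constant is that no event is processed twice, which is why a common queue lets you add processing instances freely.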

Ondro_Tadanai · October 18, 2017, 1:52pm

Thanks for the reply.
I have an additional question.
If I have a "main" Logstash instance that only fetches tweets and sends them to the other Logstash instances, then this "main" instance is a single point of failure in my infrastructure.
If I launch two instances with the same twitter configuration, how should I set them up? I am afraid that both will write all tweets to the broker (Redis, for example), so every tweet will be there twice. How can I make sure that only tweets that are not already there get put into the broker?
Thanks a lot
Ondrej

magnusbaeck · October 19, 2017, 10:00am

I would probably

1. make all Logstash instances pull tweets and push them to a broker,
2. write a small service that fetches from the broker and performs deduplication (i.e. drops messages it has seen before) and posts to another queue, and
3. configure Logstash to pull from that queue.
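
The deduplication step could be sketched roughly as follows. This is only an illustration under stated assumptions: plain Python lists stand in for the two queues, each event is assumed to carry a unique tweet identifier under a key named `id` (adjust `key_field` to whatever your events actually contain), and `SeenCache` and `deduplicate` are hypothetical names, not part of Logstash or any broker client.

```python
from collections import OrderedDict


class SeenCache:
    """Remembers the most recent `capacity` keys so memory stays bounded."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self._keys = OrderedDict()

    def first_time(self, key):
        """Return True if `key` has not been seen before, and record it."""
        if key in self._keys:
            self._keys.move_to_end(key)  # refresh its position
            return False
        self._keys[key] = None
        if len(self._keys) > self.capacity:
            self._keys.popitem(last=False)  # evict the oldest key
        return True


def deduplicate(events, cache, key_field="id"):
    """Yield only the events whose `key_field` value has not been seen yet."""
    for event in events:
        if cache.first_time(event[key_field]):
            yield event


# Two readers delivered overlapping tweets to the first queue (sample data).
incoming = [
    {"id": "1", "text": "a"},
    {"id": "2", "text": "b"},
    {"id": "1", "text": "a"},
    {"id": "3", "text": "c"},
    {"id": "2", "text": "b"},
]
outgoing = list(deduplicate(incoming, SeenCache()))
print([event["id"] for event in outgoing])  # → ['1', '2', '3']
```

The cache is bounded, so a duplicate that arrives after its ID has been evicted would pass through again; the capacity has to be sized for how far apart the instances can deliver the same tweet. In a real deployment the seen-ID store would also need to survive restarts of the service, which this in-memory sketch does not attempt.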
