I am using Logstash with the Twitter input plugin (logstash-input-twitter) to collect tweets in real time.
How should I configure Logstash for scaling? If I want to run two instances, for example, how do I set them up to cooperate effectively when both use the Twitter input plugin?
Thanks a lot.
Any help please?
Logstash has no built-in support for getting two or more instances to talk to each other. If a single Logstash instance can't process the events fast enough, I suggest you use one instance to just read the tweets and do nothing else except pass the events to any number of other instances, preferably through an in-between broker with a common queue that all consuming instances read from.
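To illustrate why a common queue helps, here is a minimal in-process sketch in Python. It is not Logstash configuration: `queue.Queue` stands in for the broker, the threads stand in for the downstream Logstash instances, and the names `events`, `worker` and `indexer-N` are made up for the example. The point is only that each event placed on a shared queue is taken by exactly one consumer.

```python
import queue
import threading

events = queue.Queue()      # stands in for the broker's common queue
handled_by = {}             # event -> list of workers that processed it
lock = threading.Lock()


def worker(name):
    """Take events off the shared queue until a None sentinel arrives."""
    while True:
        event = events.get()
        if event is None:   # sentinel: nothing more to process
            break
        with lock:
            handled_by.setdefault(event, []).append(name)


# The single "reader" instance pushes every tweet onto the queue once.
for i in range(20):
    events.put(f"tweet-{i}")

workers = [
    threading.Thread(target=worker, args=(f"indexer-{n}",)) for n in range(3)
]
for _ in workers:
    events.put(None)        # one sentinel per worker
for t in workers:
    t.start()
for t in workers:
    t.join()

# Every event was handled, and by exactly one worker.
exactly_once = all(len(names) == 1 for names in handled_by.values())
print(len(handled_by), exactly_once)
```

With a real broker the same effect comes from all consuming instances reading from one list or queue rather than each getting its own copy of the stream.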
Thanks for the reply.
I have an additional question.
If I have a "main" Logstash instance that only fetches tweets and sends them to the other Logstash instances, that "main" instance is a single point of failure in my infrastructure.
If I launch two instances with the same Twitter configuration, how should I set them up? I am afraid that both will write every tweet to the broker (Redis, for example), so every tweet will be there twice. How can I put into the broker only the tweets that are not already there?
Thanks a lot.
I would probably
- make all Logstash instances pull tweets and push them to a broker,
- write a small service that fetches from the broker and performs deduplication (i.e. drops messages it has seen before) and posts to another queue, and
- configure Logstash to pull from that queue.
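The deduplication step could look roughly like the following Python sketch. It is only an illustration under some assumptions: `queue.Queue` stands in for the two broker queues, each message is a JSON document, and the tweet's unique identifier is in a field called `id` (the actual field name depends on how the Twitter input is configured). `SeenCache` and `dedupe` are names invented for this example.

```python
import json
import queue
from collections import OrderedDict


class SeenCache:
    """Remembers the most recent `max_size` keys; the oldest are evicted."""

    def __init__(self, max_size=100_000):
        self.max_size = max_size
        self._keys = OrderedDict()

    def add_if_new(self, key):
        """Return True if `key` has not been seen before, and remember it."""
        if key in self._keys:
            self._keys.move_to_end(key)
            return False
        self._keys[key] = None
        if len(self._keys) > self.max_size:
            self._keys.popitem(last=False)  # drop the oldest key
        return True


def dedupe(raw_queue, clean_queue, seen, key_field="id"):
    """Drain raw_queue and forward only first-seen events to clean_queue.

    `key_field` is an assumption about where the tweet ID lives.
    """
    while True:
        try:
            message = raw_queue.get_nowait()
        except queue.Empty:
            break
        event = json.loads(message)
        if seen.add_if_new(event[key_field]):
            clean_queue.put(message)


# Two Logstash instances pushing the same tweets produce duplicates:
raw = queue.Queue()
for tweet_id in (1, 2, 1, 3, 2):
    raw.put(json.dumps({"id": tweet_id, "text": f"tweet {tweet_id}"}))

clean = queue.Queue()
dedupe(raw, clean, SeenCache())
unique_ids = [json.loads(m)["id"] for m in list(clean.queue)]
print(unique_ids)  # [1, 2, 3]
```

The cache is bounded so memory does not grow forever; duplicates from two instances following the same stream should arrive close together, so a limited window is probably enough. Note that the deduplication service itself would then need to be made redundant, for example by keeping the seen IDs in a shared store instead of in process memory.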