I want to use Logstash with the Twitter input to track tweets about different themes. For example, I want to run one Logstash instance per theme, with the following configuration:
As you can see, I want to index each tweet in the Elasticsearch index that corresponds to its theme. I currently have 18 themes, so I would like to run 18 Logstash instances that send the data to 18 different indices (one per theme). I'm not sure I have enough memory on my server to run 18 instances of Logstash.
This is my current server configuration:
Total RAM: 8 GB
2 Elasticsearch instances: 2 GB each
Elasticsearch queries use a lot of cache: 2 GB
So I have only 1 GB left to run my 18 Logstash instances. Do you think that's possible? Do you have other solutions?
With the above configuration, I could run a single Logstash instance with 18 Twitter inputs and 18 conditions for the indices, but I might receive a lot of input data from Twitter. Do you think Logstash can process more than 5000 tweets per second?
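For illustration, the single-instance layout could look roughly like the sketch below, reduced to two themes. The theme names, keywords, index names, Elasticsearch host and credential placeholders are all assumptions, not the real values; the remaining themes would follow the same pattern.

```
input {
  # One twitter input per theme; the "tags" option marks each event with its theme.
  twitter {
    consumer_key       => "<consumer_key>"        # placeholder credentials
    consumer_secret    => "<consumer_secret>"
    oauth_token        => "<oauth_token>"
    oauth_token_secret => "<oauth_token_secret>"
    keywords           => ["football", "soccer"]  # hypothetical keywords for a "sport" theme
    full_tweet         => true
    tags               => ["sport"]
  }
  twitter {
    consumer_key       => "<consumer_key>"
    consumer_secret    => "<consumer_secret>"
    oauth_token        => "<oauth_token>"
    oauth_token_secret => "<oauth_token_secret>"
    keywords           => ["election", "parliament"]  # hypothetical keywords for a "politics" theme
    full_tweet         => true
    tags               => ["politics"]
  }
}

output {
  # One condition per theme routes each tweet to its own index.
  if "sport" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]   # assumed Elasticsearch address
      index => "twitter-sport"      # hypothetical index name
    }
  } else if "politics" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "twitter-politics"
    }
  }
}
```

Each `twitter` block opens its own streaming connection, so with 18 of them the number of simultaneous connections grows accordingly.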
If I use one Logstash instance with 18 Twitter inputs, I get these warnings:
[2017-02-17T14:04:35,373][WARN ][logstash.inputs.twitter ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,374][WARN ][logstash.inputs.twitter ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,374][WARN ][logstash.inputs.twitter ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,376][WARN ][logstash.inputs.twitter ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,410][WARN ][logstash.inputs.twitter ] Twitter too many requests error, sleeping for 300s
...
For information, I set the following JVM memory settings for the Logstash instance:
-Xms256m
-Xmx4g