Multiple instances of Logstash

Hello everyone,

I want to use Logstash with the Twitter input to track tweets about different themes. For example, I want to run one Logstash instance per theme, with the following configurations:

Logstash instance 1:

input {
    twitter {
        consumer_key => "***************"
        consumer_secret => "***************"
        oauth_token => "*******************"
        oauth_token_secret => "************"
        keywords => ["nature", "earth"]
        full_tweet => true
        codec => "json"
    }
}
filter {
    # some actions
}
output {
    elasticsearch {
        hosts => ["127.0.0.1:9200"]
        index => "nature_theme"
        document_type => "tweet"
        document_id => "%{id_str}"
    }
}

Logstash instance 2:

input {
    twitter {
        consumer_key => "***************"
        consumer_secret => "***************"
        oauth_token => "*******************"
        oauth_token_secret => "************"
        keywords => ["bread", "honey"]
        full_tweet => true
        codec => "json"
    }
}
filter {
    # some actions
}
output {
    elasticsearch {
        hosts => ["127.0.0.1:9200"]
        index => "food_theme"
        document_type => "tweet"
        document_id => "%{id_str}"
    }
}

As you can see, I want to index each tweet in the Elasticsearch index that corresponds to its theme. For now I have 18 themes, so I would like to run 18 Logstash instances that send data to 18 different indices (one per theme). I'm not sure I have enough memory on my server to run 18 instances of Logstash.

This is my current configuration:

  • Total RAM: 8 GB
  • 2 Elasticsearch instances: 2 GB each
  • Elasticsearch queries use a lot of cache: 2 GB

So I have only 1 GB left to run my 18 Logstash instances. Do you think that's possible? Do you have other solutions?

Thank you in advance.

Why do you think you need to run an instance per "theme"?

Check out conditionals; they will help.

Do you think this solution is better?

input {
    twitter {
        consumer_key => "***************"
        consumer_secret => "***************"
        oauth_token => "*******************"
        oauth_token_secret => "************"
        keywords => ["nature", "earth"]
        full_tweet => true
        codec => "json"
        add_field => { "[@metadata][theme]" => "nature" }
    }
    twitter {
        consumer_key => "***************"
        consumer_secret => "***************"
        oauth_token => "*******************"
        oauth_token_secret => "************"
        keywords => ["bread", "honey"]
        full_tweet => true
        codec => "json"
        add_field => { "[@metadata][theme]" => "food" }
    }
}
filter {
    # some actions
}
output {
    if [@metadata][theme] == "nature" {
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "nature_theme"
            document_type => "tweet"
            document_id => "%{id_str}"
        }
    } else if [@metadata][theme] == "food" {
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            index => "food_theme"
            document_type => "tweet"
            document_id => "%{id_str}"
        }
    }
}

With the above configuration, I can have one Logstash instance with 18 Twitter inputs and 18 conditions for the indices, but I might receive a lot of input data from Twitter. Do you think Logstash can process more than 5000 tweets per second?

Thank you in advance.
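
As a side note on the 18 conditions: since every input already stores its theme in [@metadata][theme], the per-theme branches could probably be collapsed into a single elasticsearch output that builds the index name from that field. This is only a sketch based on the configuration above. It assumes every index follows the "<theme>_theme" naming pattern, and the guard condition is an addition so that events without a theme are not sent to a literally named "%{[@metadata][theme]}_theme" index.

```
output {
    # Only index events that carry a theme; others are dropped from this output.
    if [@metadata][theme] {
        elasticsearch {
            hosts => ["127.0.0.1:9200"]
            # Resolves to "nature_theme", "food_theme", and so on.
            index => "%{[@metadata][theme]}_theme"
            document_type => "tweet"
            document_id => "%{id_str}"
        }
    }
}
```

With this form, adding a theme only requires a new twitter input with its own add_field value; the output section stays unchanged.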

That depends on what resources the host has. Test it and see.

If I use one Logstash instance with 18 Twitter inputs, I run into this issue:

[2017-02-17T14:04:35,373][WARN ][logstash.inputs.twitter  ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,374][WARN ][logstash.inputs.twitter  ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,374][WARN ][logstash.inputs.twitter  ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,376][WARN ][logstash.inputs.twitter  ] Twitter too many requests error, sleeping for 300s
[2017-02-17T14:04:35,410][WARN ][logstash.inputs.twitter  ] Twitter too many requests error, sleeping for 300s
...

For information, I use the following memory settings for the Logstash instance:
-Xms256m
-Xmx4g

Do you have any idea how to solve this issue?

Thank you in advance.
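
The repeated "Twitter too many requests" warnings suggest that Twitter is throttling the 18 simultaneous streaming connections, which all share one set of credentials. One possible workaround, sketched here for two themes only, is to open a single twitter input that carries every keyword and to assign the theme in the filter stage instead. The regular expressions and the choice of the text field are assumptions for illustration: Twitter may also match keywords outside the tweet body (for example in hashtags or URLs), so some events could end up without a theme, and a tweet that matches both patterns takes only the first branch.

```
input {
    # One streaming connection for all themes.
    twitter {
        consumer_key => "***************"
        consumer_secret => "***************"
        oauth_token => "*******************"
        oauth_token_secret => "************"
        keywords => ["nature", "earth", "bread", "honey"]
        full_tweet => true
    }
}
filter {
    # Assumed mapping from tweet text to theme (case-insensitive, whole words).
    if [text] =~ /(?i)\b(nature|earth)\b/ {
        mutate { add_field => { "[@metadata][theme]" => "nature" } }
    } else if [text] =~ /(?i)\b(bread|honey)\b/ {
        mutate { add_field => { "[@metadata][theme]" => "food" } }
    }
}
```

The output section can then keep routing on [@metadata][theme] exactly as in the earlier configuration.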
