Logstash - Beats input: The circuit breaker has detected a slowdown


(Mike) #1

Hi all,

I'm trying to ship some file logs with this setup:

filebeat(host1) -> logstash(host1) --+-> elasticsearch
                                     |
filebeat(host2) -> logstash(host2) --+
                                     |
filebeat(host3) -> logstash(host3) --+

All of the Logstash instances are reporting this error:

{:timestamp=>"2016-02-10T15:39:49.575000+0000", :message=>"retrying failed action with response code: 429", :level=>:warn}
{:timestamp=>"2016-02-10T15:39:49.577000+0000", :message=>"retrying failed action with response code: 429", :level=>:warn}
{:timestamp=>"2016-02-10T15:39:49.578000+0000", :message=>"retrying failed action with response code: 429", :level=>:warn}
{:timestamp=>"2016-02-10T15:39:50.547000+0000", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Beats input", :exception=>LogStash::Inputs::Beats::InsertingToQueueTakeTooLong, :level=>:warn}
{:timestamp=>"2016-02-10T15:39:50.548000+0000", :message=>"Beats input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::Inputs::BeatsSupport::CircuitBreaker::HalfOpenBreaker, :level=>:warn}
[...]
{:timestamp=>"2016-02-10T15:55:29.857000+0000", :message=>"Beats input: the pipeline is blocked, temporary refusing new connection.", :reconnect_backoff_sleep=>0.5, :level=>:warn}
{:timestamp=>"2016-02-10T15:55:30.358000+0000", :message=>"Beats input: the pipeline is blocked, temporary refusing new connection.", :reconnect_backoff_sleep=>0.5, :level=>:warn}

Things I've tried:

  • beefed up the Elasticsearch cluster (now 5 nodes with 8 CPUs and 8 GB RAM each)
  • beefed up the Logstash heap (went up to 3 GB, now back to 1 GB)
  • played with Logstash settings (pipeline workers, batch size)
  • played with Elasticsearch settings (indices.fielddata.cache.size)
  • updated the whole stack to the latest versions (Logstash 2.2.0, Filebeat 1.1.0, Elasticsearch 2.2.0)
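For reference, the worker and batch-size tuning above was done with something like the Logstash 2.2 command-line flags (the exact worker/batch values here are just examples, and the config path assumes a standard package install):

```shell
# Logstash 2.2: -w sets the number of pipeline workers,
# -b sets the per-worker event batch size.
bin/logstash -w 4 -b 250 -f /etc/logstash/conf.d/
```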

and now I've run out of ideas.
The logs I'm trying to ship are roughly 20 lines every 20 seconds, and Kibana shows an average of about 9k events per minute.
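One more knob I could try: the 429s are Elasticsearch pushing back on bulk indexing, so shrinking the bulk requests the elasticsearch output sends might help. A sketch only, since my output section isn't shown above (hostnames are placeholders; `workers` and `flush_size` are options of the 2.x elasticsearch output plugin):

```
output {
    elasticsearch {
        hosts => ["es-node1:9200", "es-node2:9200"]  # placeholder hostnames
        workers => 2       # parallel bulk senders
        flush_size => 250  # smaller bulk requests, less pressure on the bulk queue
    }
}
```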

Any thoughts/ideas are more than welcome.

Cheers,

# logstash config
input {
    udp {
        port => 25826
        buffer_size => 1452
        codec => collectd { }
    }
    beats {
        port => 5044
    }
}
# plus 10 grok filters doing manipulation
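Since the circuit breaker fires when the filter stage can't drain the queue fast enough, those ten grok filters are a prime suspect; unanchored patterns that backtrack on non-matching lines are a classic slowdown. A sketch with a hypothetical pattern (the field names and pattern are illustrative, not my real filters):

```
filter {
    grok {
        # Anchoring with ^ makes non-matching lines fail fast
        # instead of backtracking through the whole message.
        match => { "message" => "^%{TIMESTAMP_ISO8601:ts} %{GREEDYDATA:msg}" }
    }
}
```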
# filebeat config (excerpt)
filebeat:
  prospectors:
    -
      paths:
        - /var/log/upstart/mtr.log
      type: log
      fields:
        tag: mtr

  idle_timeout: 5s


(system) #2