Logstash dies after about 8 hours

Hi everyone,
I am new to ELK and I am running it on a CentOS 7 VM with about 16 GB of memory and 6 virtual sockets with 2 cores each.
Logstash 2.2.2, Elasticsearch 2.2.1, and Kibana 4.4.2 are all the latest versions.
Everything works fine for about 8-12 hours, then Logstash dies. When I check the status it shows Active (exited). If I restart Logstash, it runs for about another 8 hours before dying again.
Looking at the logs, I see the following:

{:timestamp=>"2016-03-20T00:04:24.357000-0400", :message=>"Connection pool shut down", :class=>"Manticore::ClientStoppedException", :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.5.2-java/lib/manticore/response.rb:37:in initialize'", "org/jruby/RubyProc.java:281:incall'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.5.2-java/lib/manticore/response.rb:79:in call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.5.2-java/lib/manticore/response.rb:256:incall_once'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.5.2-java/lib/manticore/response.rb:153:in code'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/http/manticore.rb:71:inperform_request'", "org/jruby/RubyProc.java:281:in call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/base.rb:201:inperform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/http/manticore.rb:54:in perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/sniffer.rb:32:inhosts'", "org/jruby/ext/timeout/Timeout.java:147:in timeout'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/sniffer.rb:31:inhosts'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.15/lib/elasticsearch/transport/transport/base.rb:76:in reload_connections!'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.5.1-java/lib/logstash/outputs/elasticsearch/http_client.rb:72:insniff!'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.5.1-java/lib/logstash/outputs/elasticsearch/http_client.rb:60:in start_sniffing!'", "org/jruby/ext/thread/Mutex.java:149:insynchronize'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.5.1-java/lib/logstash/outputs/elasticsearch/http_client.rb:60:in start_sniffing!'", "org/jruby/RubyKernel.java:1479:inloop'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.5.1-java/lib/logstash/outputs/elasticsearch/http_client.rb:59:in `start_sniffing!'"], :level=>:error}

This looks like an issue with the elasticsearch output plugin (can you show us your Logstash configuration?). Maybe try the non-Java output plugin.

Here is my Logstash config. However, when you say to try the non-Java output plugin, what do you mean?

input {
  ##################################################################
  # Port 5044: Filebeat (with TLS)
  ##################################################################
  beats {
    type => "logs"
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
  ##################################################################
  # Port 5045: Packetbeat (with TLS)
  ##################################################################
  beats {
    port => 5045
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
  ##################################################################
  # Port 5043: Topbeat (with TLS)
  ##################################################################
  beats {
    port => 5043
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
  }
}

filter {
  if [fields][log_type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
    geoip {
      source => "src_ip"
      target => "geoip"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    }
    mutate {
      convert => [ "[geoip][coordinates]", "float" ]
    }
    geoip {
      source => "host"
      target => "geoip"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    }
    mutate {
      convert => [ "[geoip][coordinates]", "float" ]
    }
  } #Endif syslog
  #####################################################
  #For all: remove certain items from [tags]
  #####################################################
  if "_grokparsefailure" in [tags] {
    mutate {
      remove_tag => "_grokparsefailure"
    }
  }
  if "_jsonparsefailure" in [tags] {
    mutate {
      remove_tag => "_jsonparsefailure"
    }
  }
  if "_jsonparsefailure_grokparsefailure" in [tags] {
    mutate {
      remove_tag => "_jsonparsefailure_grokparsefailure"
    }
  }
  if [fields][log_type] == "yum" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{DATA:yum_event}\: %{GREEDYDATA:yum_package}" }
      add_tag => [ "yum_events" ]
    }
  } #Endif yum
} #End filter

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    sniffing => true
  }
}

I believe the error happens when the Logstash elasticsearch output plugin tries to retrieve the list of ES cluster nodes (a.k.a. sniffing); I can see an HTTP timeout in the backtrace.

Disabling this feature should help (you can hardcode the list of ES nodes in the Logstash configuration instead).
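
A minimal sketch of what that would look like, assuming a single local node as in your config (add more entries to hosts if you have more nodes):

output {
  elasticsearch {
    # Hardcode the node list and turn sniffing off so the plugin
    # never tries to discover cluster nodes over HTTP.
    hosts => ["127.0.0.1:9200"]
    sniffing => false
  }
}

I believe sniffing defaults to false in this version of the plugin, so simply removing the sniffing => true line should have the same effect.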

But I agree, this is a bug; Logstash should not exit on such a timeout error.


Thanks, will try it out and let you know.

Thanks Thomas, removing the sniffing did the trick. Logstash has been up ever since.
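
For completeness, here is roughly what my output section looks like now (same as before, just without the sniffing line):

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
  }
}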

I had the same problem (even my environment is almost the same, down to the Beats ports).

As @snipervelli said, after removing sniffing, LS has stayed up in my environment as well.

Thanks for the info!