Logstash connections intermittently stuck in CLOSE_WAIT

Hi, I'm using Filebeat 5.1.2 to send logs from around 430 instances to a central Logstash 2.4.1 server over SSL on port 443.

Sometimes, when Logstash is restarted, the Filebeat connections to Logstash go into the CLOSE_WAIT state. Logstash CPU utilization jumps from about 40% to around 400%, and the number of TCP connections on port 443 keeps increasing until it reaches about 32k. While Logstash is in this state, the Filebeat logs show that the TCP connection to Logstash failed with an i/o timeout.

However, this behavior is not consistent. After one or two more restarts of Logstash the issue disappears: TCP connections on port 443 stabilize at around 430, CPU utilization comes back to 40%, and logs flow normally.

A stable pipeline is crucial for my use case, and because of these intermittent failures I can't trust the pipeline for real-time alerting. I need help fixing this.
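To put numbers on the symptom described above, here is a minimal sketch that counts sockets per TCP state on the Logstash listening port. It assumes a Linux host, where the kernel exposes the socket table in /proc/net/tcp (and /proc/net/tcp6 for IPv6 listeners); the function name `count_states` is hypothetical and is not part of Logstash or Filebeat.

```python
from collections import Counter

# TCP state codes as they appear in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED",
    "02": "SYN_SENT",
    "03": "SYN_RECV",
    "04": "FIN_WAIT1",
    "05": "FIN_WAIT2",
    "06": "TIME_WAIT",
    "07": "CLOSE",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
    "0B": "CLOSING",
}


def count_states(proc_net_tcp_text, local_port=443):
    """Count sockets per TCP state whose local port equals local_port.

    proc_net_tcp_text is the full text of /proc/net/tcp or /proc/net/tcp6.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        local_addr, state_hex = fields[1], fields[3]
        # The local address is "HEXIP:HEXPORT"; the port is after the last colon.
        port = int(local_addr.rsplit(":", 1)[1], 16)
        if port == local_port:
            state = TCP_STATES.get(state_hex.upper(), state_hex)
            counts[state] += 1
    return counts
```

On the Logstash host this could be run periodically as `count_states(open("/proc/net/tcp").read())` (and again with /proc/net/tcp6 if the listener is bound on IPv6) to see whether the CLOSE_WAIT count keeps growing after a restart or settles near the expected ~430 ESTABLISHED connections.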

Here are the configs:

Logstash beats input

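input {
    beats {
        # Listen for Filebeat connections on 443 with TLS enabled
        port => 443
        ssl => true
        # CA used to verify the client (Filebeat) certificates
        ssl_certificate_authorities => ["/etc/logstash/certs/cacert_filebeat.pem"]
        ssl_certificate => "/etc/logstash/certs/ls-zs.pb.zscaleranalytics.net.crt"
        ssl_key => "/etc/logstash/certs/ls-zs.pb.zscaleranalytics.net.key.pem"
        # Require every client to present a valid certificate
        ssl_verify_mode => "force_peer"
    }
}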

Filebeat output config

output.logstash:
  hosts: ["ls-zs2.pb.zscaleranalytics.net:443"]
  ssl.certificate_authorities: ["/etc/filebeat/certs/cacert_logstash.pem"]
  ssl.certificate: "/etc/filebeat/certs/fbt-zs2.crt"
  ssl.key: "/etc/filebeat/certs/fbt-zs2.key"
