Logstash stops working silently (workers die out)

Hello,

At the customer site, we are using the Elastic Stack to collect syslog messages from the network devices.
Logstash receives around 4000 syslog messages per second. We are running Logstash version 6.8.8.

The problem is that every 2-5 hours the Logstash workers start to die off, one after another, until every worker is dead and no more syslog messages are received.

Logstash runs on a dedicated server, the Elasticsearch master and Kibana share one server, and we have 2
Elasticsearch node servers.

Logstash x 1
CPU: 4 x Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
RAM: 8G
DISK: 20G

Kibana + ElasticSearch Master x 1
CPU: 4 x Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
RAM: 8G
DISK: 20G
OS: CentOS

ElasticSearch Node x 2
CPU: 4 x Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
RAM: 16G
DISK: 20G for OS and 800G for data
OS: CentOS

more /etc/logstash/logstash.yml

pipeline.batch.size: 250
pipeline.batch.delay: 50
pipeline.unsafe_shutdown: true
pipeline.workers: 30
path.data: /var/lib/logstash
config.reload.automatic: true
config.reload.interval: 10s
path.logs: /var/log/logstash

more /etc/logstash/jvm.options

-Xms7096M -Xmx7096M
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djruby.compile.invokedynamic=true
-Djruby.jit.threshold=0
-XX:+HeapDumpOnOutOfMemoryError
-Djava.security.egd=file:/dev/urandom
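
For readers comparing these settings with the hardware listed above, here is a minimal Python sketch that derives a few numbers from them. It relies on the documented Logstash rule that the number of in-flight events is pipeline.workers multiplied by pipeline.batch.size. Reading "RAM: 8G" as 8 GiB is an assumption, and the variable names are illustrative only. It is not a diagnosis, just arithmetic on the posted values.

```python
# Values copied from the logstash.yml, jvm.options and server specs in this post.
pipeline_workers = 30
pipeline_batch_size = 250
heap_mb = 7096
cpu_cores = 4
ram_mb = 8 * 1024  # assumption: "RAM: 8G" means 8 GiB

# Upper bound on events the pipeline holds in memory at one time.
inflight_events = pipeline_workers * pipeline_batch_size

# How many worker threads compete for each CPU core.
workers_per_core = pipeline_workers / cpu_cores

# Fraction of physical memory reserved for the JVM heap.
heap_share = heap_mb / ram_mb

print(f"in-flight events: {inflight_events}")
print(f"workers per core: {workers_per_core:.1f}")
print(f"heap share of RAM: {heap_share:.0%}")
```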

Can you please help me troubleshoot this issue?

Thank you.
Regards

What is logged in the logstash logs when the worker threads die?

There is nothing consistent in the log. Very often there is nothing at all. Sometimes there is "DNS: timeout on resolving address", and sometimes "Received an event that has a different character encoding than you configured.", but I don't think those are related because they do not appear consistently.
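
Since the log gives no consistent clue, one way to pin down when the pipeline stops making progress is to poll the Logstash monitoring API and record the moment the output counter stops increasing. The following is a minimal Python sketch, not a tested monitoring tool. It assumes the API listens on its default address localhost:9600 and that the pipeline has the default id "main". The function names are made up for illustration.

```python
import json
import time
import urllib.request

# Assumptions: default monitoring API address and default pipeline id.
STATS_URL = "http://localhost:9600/_node/stats/pipelines"
PIPELINE_ID = "main"


def events_out(stats, pipeline_id=PIPELINE_ID):
    """Return the cumulative 'out' event counter for one pipeline."""
    return stats["pipelines"][pipeline_id]["events"]["out"]


def is_stalled(previous_out, current_out):
    """True when no events left the pipeline between two samples."""
    return current_out <= previous_out


def fetch_stats(url=STATS_URL):
    """Fetch node stats from the Logstash monitoring API as a dict."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def watch(interval_seconds=30):
    """Print a timestamped line each time the pipeline makes no progress."""
    previous = events_out(fetch_stats())
    while True:
        time.sleep(interval_seconds)
        current = events_out(fetch_stats())
        if is_stalled(previous, current):
            print(time.strftime("%Y-%m-%d %H:%M:%S"),
                  "no events out since last sample")
        previous = current
```

Calling watch() on the Logstash server starts the polling loop. When it reports a stall, requesting /_node/hot_threads from the same API at that moment may show what the remaining worker threads are doing.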
