In our stack, we have 12 Logstash nodes. They work fine right after we restart all of them, but after some time a few nodes stop accepting data while the rest keep processing. One by one, all of the nodes go down this way. In logstash-stderr.log we were able to find the point at which the last event got processed; after that the Logstash service was still running and kept sending monitoring data to Elasticsearch, but ingestion had stopped.
What factors should we look into? Could this be caused by one of our filters? We are not seeing anything in the stdout logs. This started happening suddenly; until then everything was working fine. Earlier the ingestion rate was nearly 20k events/s, but now it starts around 10k/s, gradually decreases, and then stops. The only workaround we have found is to restart the nodes.
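One check we could script while debugging is polling the Logstash node stats API (by default `http://localhost:9600/_node/stats/pipelines`) on each node and comparing the `events.in` / `events.out` counters between snapshots: a pipeline that keeps receiving events but whose `out` counter stops advancing is stalled. A rough sketch (the snapshot payloads and the `find_stalled` helper are our own illustration, not part of Logstash; only the `pipelines -> <name> -> events -> in/out` field layout follows the documented stats response):

```python
# Sketch: flag Logstash pipelines whose output counter has stopped
# advancing, given two snapshots of GET /_node/stats/pipelines taken
# some interval apart. The payloads below are hypothetical samples.

def find_stalled(prev: dict, curr: dict) -> list[str]:
    """Return pipeline names that received events between snapshots
    but emitted none (a sign the pipeline is blocked, not just idle)."""
    stalled = []
    for name, stats in curr.get("pipelines", {}).items():
        prev_events = prev.get("pipelines", {}).get(name, {}).get("events", {})
        curr_events = stats.get("events", {})
        in_delta = curr_events.get("in", 0) - prev_events.get("in", 0)
        out_delta = curr_events.get("out", 0) - prev_events.get("out", 0)
        if in_delta > 0 and out_delta == 0:
            stalled.append(name)
    return stalled

# Hypothetical snapshots from one node, taken a minute apart:
snap1 = {"pipelines": {"main": {"events": {"in": 100000, "out": 100000}}}}
snap2 = {"pipelines": {"main": {"events": {"in": 100500, "out": 100000}}}}

print(find_stalled(snap1, snap2))  # -> ['main']
```

Running this periodically against each of the 12 nodes would at least tell us which node stalls first and how long after a restart, which narrows down whether a particular input, filter, or output is the bottleneck.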