I'm using Filebeat to read from a file and push the data to Logstash. Logstash, in turn, publishes the data to RabbitMQ.
At times, however, we stop receiving data in RabbitMQ, even though the Logstash service is still running and writing logs with no error messages. Every previous time this happened, we had also seen disk space or CPU utilisation problems, so I chalked it up to those. Restarting Filebeat and Logstash often resolved the issue.
This time the same thing happened, but disk space is sufficient and the EC2 CPU metrics look fine. Restarting Filebeat and Logstash resolved it again.
Each occurrence disrupts the pipeline. How do I fix the underlying issue and do a root cause analysis (RCA) on it? (FYI, it's a legacy system, and we also have an nginx server running on the same instance.)
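
For the RCA, one thing that could be captured the next time the flow stops is whether events are still entering Logstash or only failing to leave it. Below is a minimal sketch, assuming Logstash's monitoring API is enabled on its default port 9600 and that `/_node/stats/pipelines` reports per-pipeline `events.in` and `events.out` counters. The function names and the returned labels are hypothetical, chosen only for illustration.

```python
import json
import urllib.request

# Assumption: the Logstash monitoring API listens on its default port 9600.
STATS_URL = "http://localhost:9600/_node/stats/pipelines"


def fetch_event_counters(url=STATS_URL, timeout=5):
    """Return {pipeline_name: {"in": int, "out": int}} from the node stats API."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        stats = json.load(resp)
    return {
        name: {"in": p["events"]["in"], "out": p["events"]["out"]}
        for name, p in stats.get("pipelines", {}).items()
        if p.get("events")
    }


def classify(prev, curr):
    """Compare two snapshots of one pipeline's counters (hypothetical labels)."""
    d_in = curr["in"] - prev["in"]
    d_out = curr["out"] - prev["out"]
    if d_in < 0 or d_out < 0:
        return "restarted"       # counters went backwards, so Logstash restarted
    if d_in == 0 and d_out == 0:
        return "no-input"        # nothing arrived, so look at the Filebeat side
    if d_out == 0:
        return "output-stalled"  # events arrived but none left, so look at the output side
    return "flowing"
```

Taking a snapshot with `fetch_event_counters()` every minute or so and logging `classify(prev, curr)` per pipeline would at least show which side of Logstash went quiet before a restart wipes the evidence.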