We currently have a server with redis installed and several logstash instances. Half of these instances listen on a port and output to redis; the other half read from redis, filter, and output to elasticsearch. I have noticed that after a few days, the redis input stops working. Using redis-stat, I can see that the redis db size keeps increasing because the logs are no longer being 'blpoped'. The resulting memory usage slows the server to a crawl, and cluster instability inevitably follows.
I'm having trouble finding the cause. The logstash .log files show only some java.lang.IllegalArgumentException: Invalid format errors for the timestamp, but those go back several days. The redis input instances stopped working sometime between 4 PM yesterday and 10 AM today, and the last timestamp for the logs is Aug 20 at 3:35 AM. I know for a fact it was working most of the day yesterday.
Where should I be looking? I see nothing out of the ordinary in the redis logs either. There are some memory errors in the elasticsearch logs, but I'm not sure whether they are part of the issue. If logstash is not able to output to elasticsearch, will it stop input as well?
I should note that I can get it working again by killing the logstash processes and restarting them; after that it seems to work fine. Also, the logstash instances that output to redis work fine the entire time.
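To narrow down when the consumers stop, a small check on the redis list length could help. The following is only a minimal sketch: it assumes the logstash instances share a redis list under the hypothetical key "logstash" on localhost:6379, and that redis-py is installed for the live lookup. The helper names (is_stalled, sample_lengths, redis_list_length) are made up for illustration. The idea is that a healthy consumer makes the length drop at least occasionally, while a stalled one lets it only grow.

```python
import time
from typing import Callable, List


def is_stalled(samples: List[int], min_growth: int = 1) -> bool:
    """Return True if the list length never dropped between samples and grew overall."""
    if len(samples) < 2:
        return False
    never_dropped = all(later >= earlier for earlier, later in zip(samples, samples[1:]))
    return never_dropped and (samples[-1] - samples[0]) >= min_growth


def sample_lengths(get_length: Callable[[], int], count: int, interval_s: float) -> List[int]:
    """Call get_length() `count` times, sleeping interval_s seconds between calls."""
    samples = []
    for i in range(count):
        samples.append(get_length())
        if i < count - 1:
            time.sleep(interval_s)
    return samples


def redis_list_length(key: str = "logstash", host: str = "localhost", port: int = 6379) -> int:
    """Read the list length with LLEN. Key, host, and port are assumptions, not known values."""
    import redis  # redis-py; imported lazily so the pure helpers work without it

    return redis.Redis(host=host, port=port).llen(key)


if __name__ == "__main__":
    # Five samples, one minute apart; adjust to the real traffic pattern.
    lengths = sample_lengths(redis_list_length, count=5, interval_s=60)
    if is_stalled(lengths):
        print("redis list only grew, consumers may have stopped:", lengths)
    else:
        print("redis list is being drained:", lengths)
```

Run from cron, this would at least give a timestamp for when the draining stops. A consumer that is merely slower than the producers can also produce a growing list, so a positive result is a hint to check the logstash input instances rather than proof that they have hung.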