I have a Logstash container deployed in ECS with a hard memory limit of 1.5GB. The pipeline config has already been tested locally, so I'm fairly sure that's not the problem. I'm sending the container logs to CloudWatch, and there are no errors there.
Logstash version: 7.11.0
I'm not collecting any logs yet, so logstash is just sitting there idle.
I know for sure logstash is started correctly because I can see it created ILM indexes.
Now, the problem is that for no apparent reason, after approximately 4-5 minutes, it receives a SIGTERM and restarts. There are no other errors in the log. I thought maybe the Docker memory limit was being exceeded, but docker stats never shows the container using more than ~500MB.
And this is really consistent, like that 4-5 min interval just keeps repeating.
Perhaps the service manager is configured to monitor logstash by connecting to it on some port, and it is failing to connect, so it restarts it because it thinks it is unresponsive. I recall someone mentioning an issue like this a couple of years ago, but I cannot find the thread.
Hmm, good point, I'll check. There's no difference in how I set up the containers in the ECS cluster, and this is the only one having this problem. One thing that is different is that Logstash sits behind a network load balancer that forwards UDP packets on port 12201 (I use the gelf input).
Just to let you know, it was indeed a monitoring problem. Logstash was behind a load balancer, but the listener protocol is UDP, which does not work for health checks, so the checks kept failing and the container kept being restarted. The fix was to override the load balancer's default health check to use TCP port 9600 (the Logstash metrics API). Of course, I had to map container port 9600 to the host, but that may depend on your setup.
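For anyone hitting the same thing, here is a minimal sketch of the kind of TCP check the load balancer performs, which can help confirm that port 9600 is actually reachable from outside the container before changing the health check. The function name `tcp_port_open` and the host `127.0.0.1` are only illustrative assumptions; substitute the host and mapped port from your own setup.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # A plain TCP connect is roughly what a TCP health check does.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Assumed values: the ECS host address and the host port mapped to 9600.
    print(tcp_port_open("127.0.0.1", 9600))
```

If this prints False from the host, the port mapping is probably missing and a TCP health check against 9600 would fail as well.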