I'm running Logstash 6.4.2 in a container (image from Elastic) on AWS ECS. Here is some information about the setup:
- TCP Endpoint
- ssl_enable => true
- ssl_verify => true
- Data: syslog, processed with a custom Ruby filter
- Logs are sent from thousands of devices (~10-20k), each limited to 100 messages per 5 seconds
- In jvm.options, -Xmx is set to 3 GB
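For a sense of scale, the worst-case message rate implied by the numbers above can be worked out as below. This is only a rough sketch: it assumes every device sends at its cap all the time, which is probably not the case, and the function name is just for illustration.

```python
def peak_rate_per_second(devices: int, messages: int = 100, window_seconds: int = 5) -> float:
    """Upper bound on messages per second if every device sends at its cap."""
    return devices * messages / window_seconds


# Device counts taken from the list above (~10-20k devices).
for devices in (10_000, 20_000):
    total = peak_rate_per_second(devices)
    print(f"{devices} devices: up to {total:,.0f} msg/s in total")
```

So the theoretical ceiling is somewhere between 200,000 and 400,000 messages per second across all devices; the real rate is presumably lower.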
Now to my problem: the container and Logstash start up, and the container's CPU usage sits somewhere between 150% and 280%. A few megabytes of memory are allocated each second until usage reaches about 1.4 GB; then the health check fails, the container is killed, and a new one is started. This whole cycle takes 4 to 8 minutes.
I scaled up to 20 containers on different instances, with a load balancer in front of them, to see whether the load on each container would drop. That didn't help: each container behaved about the same (same CPU and memory load), even with 19 friends to share the load with.
Is this a case where Logstash can't handle the number of TCP connections, or is it something else? Memory usage isn't close to the limit when Logstash stalls and the container is killed.
Since the container/application stops responding, I unfortunately don't have any logs that are of any help.
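One thing I could try in order to capture some data before the stall is polling Logstash's monitoring API for JVM heap figures. The following is a minimal sketch, not something I have running: it assumes the API is on its default port 9600 and reachable from where the script runs, and that the JVM node stats response contains the heap fields named in the code. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the Logstash monitoring API listens on its default port 9600
# and is reachable from wherever this script runs.
STATS_URL = "http://localhost:9600/_node/stats/jvm"


def fetch_jvm_stats(url: str = STATS_URL, timeout: float = 2.0) -> dict:
    """Fetch the JVM section of the node stats and return it as a dict."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def summarize_heap(stats: dict) -> str:
    """Build a one-line heap summary from a node stats response.

    Assumes the response has jvm.mem.heap_used_in_bytes,
    jvm.mem.heap_max_in_bytes and jvm.mem.heap_used_percent.
    """
    mem = stats["jvm"]["mem"]
    used_mb = mem["heap_used_in_bytes"] / 1024 ** 2
    max_mb = mem["heap_max_in_bytes"] / 1024 ** 2
    return f"heap {used_mb:.0f}/{max_mb:.0f} MB ({mem['heap_used_percent']}%)"
```

Calling `print(summarize_heap(fetch_jvm_stats()))` every few seconds and shipping the output somewhere outside the container would at least show how the heap develops up to the point where the API itself stops answering.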
Any advice is appreciated.
Best Regards /Viktor