Hi everyone,
I'd appreciate your help with a critical issue we're facing.
We run the AWS Elasticsearch service (v7.10) with Logstash as a Docker container on ECS Fargate.
All of our environments run this service with the architecture above, and everything works great there, except in one specific environment (which has more traffic than the others) where we lose logs: we verified this with a Python script that sends logs to Logstash (roughly the sketch below), and only about 30% of them arrive in Elasticsearch.
Do you know what could be causing this? Where should we start looking?
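For reference, the test was roughly like this sketch (the host, port, and message count here are placeholders, not our real values):

```python
# Minimal sketch of our delivery test, assuming Logstash listens for UDP JSON
# on port 5000. Host, port, and message count are placeholders.
import json
import socket

HOST, PORT = "logstash.internal", 5000  # assumed Logstash UDP input endpoint
TOTAL = 10_000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for seq in range(TOTAL):
    # Tag each message with a sequence number so we can count arrivals later.
    msg = json.dumps({"test_run": "loss-check", "seq": seq})
    sock.sendto(msg.encode("utf-8"), (HOST, PORT))
sock.close()
# Afterwards we count documents matching test_run:loss-check in Elasticsearch;
# in the problem environment only ~30% show up.
```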
What is sending data to Logstash? Could it be that the Elasticsearch cluster is unable to keep up and backpressure is causing logs to be lost/dropped somewhere?
UDP does not guarantee delivery or handle backpressure, so if Logstash or Elasticsearch cannot keep up, data will be lost. The only way to guarantee delivery is to move away from UDP.
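For example, here is a minimal sketch of the same kind of sender over TCP instead, assuming a Logstash tcp input with a json_lines codec on port 5001 (the endpoint, port, and codec are assumptions, not your actual config):

```python
# Sketch: sending newline-delimited JSON over TCP to an assumed Logstash
# tcp input (codec => json_lines) on port 5001.
import json
import socket

HOST, PORT = "logstash.internal", 5001

with socket.create_connection((HOST, PORT)) as sock:
    for seq in range(10_000):
        line = json.dumps({"test_run": "tcp-check", "seq": seq}) + "\n"
        # sendall blocks until the kernel accepts the bytes, so backpressure
        # from Logstash slows the sender down instead of silently dropping
        # datagrams the way UDP does.
        sock.sendall(line.encode("utf-8"))
```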
Note that it is not just high load that may cause data loss. Any kind of hiccup could result in data loss, although possibly at a smaller scale that is more difficult to notice.