Hello,
I am trying to deploy a multi-pod Logstash StatefulSet on a Kubernetes cluster using the file input.
It looks like each pod is reading the same logs from the log file placed on a PVC, and therefore we are getting duplicated logs in our Elastic instance.
E.g., 2 pods running --> 2 logs with the same content being posted to Elastic.
Any hints on the configuration to get this solved?
So does that mean Logstash can't be deployed as a StatefulSet in Kubernetes and made highly available with more than one pod running?
The idea of using multiple pods was to improve performance and to be able to scale horizontally.
I can't see any info about this in the Logstash Helm Chart: helm-charts/logstash at main · elastic/helm-charts · GitHub
I do not use Kubernetes, but to have an HA deployment of Logstash you need third-party tools, and it also depends on your input.
For example, if you are receiving data using a TCP or UDP input, you need a load balancer in front of Logstash, and then you can run as many Logstash instances as you want. Likewise, if you are consuming data from Kafka, you can also run multiple Logstash instances.
But with the file input you need to read the file and track the position read, so having two or more tools doing this adds a lot of unnecessary complexity. That's one of the reasons you should have just one tool reading the files.
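To illustrate why this produces duplicates, here is a toy Python sketch (not Logstash code). It assumes each reader keeps its own private read position and knows nothing about the other reader, which is roughly the situation when two pods tail the same file on a shared volume. The names `OffsetTrackingReader` and `demo` are made up for this example.

```python
import os
import tempfile


class OffsetTrackingReader:
    """Toy stand-in for a file reader that remembers how far it has read."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # private bookmark, not shared with any other reader

    def poll(self):
        """Return everything appended since the last poll and advance the bookmark."""
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        return data.decode("utf-8").splitlines()


def demo():
    """Simulate two 'pods' tailing one shared log file and collect what they ship."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write("event-1\nevent-2\n")

        pod_a = OffsetTrackingReader(path)
        pod_b = OffsetTrackingReader(path)

        # Each reader starts from its own offset 0, so both ship both events.
        shipped = pod_a.poll() + pod_b.poll()

        # A new line is appended; again each reader picks it up independently.
        with open(path, "a", encoding="utf-8") as f:
            f.write("event-3\n")
        shipped += pod_a.poll() + pod_b.poll()
        return shipped
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Every event shows up once per reader, which matches the "2 pods, 2 copies" behaviour described in the question. Avoiding that would require the readers to coordinate on a shared position, which is the extra complexity mentioned above.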
Logstash alone has no support for any kind of HA deployment.
As mentioned, it depends on the input, but most of the time the performance issues or bottlenecks are not on the Logstash side but on the receiving side, so scaling Logstash horizontally may not help at all and can in some scenarios make things worse.
Also, as mentioned, Logstash has no support for HA on its own; it needs third-party tools, and your data needs to arrive through inputs that allow load balancing, for example.
Thanks for your feedback!
I currently have a setup with an HTTP input that works, with an ingress handling the load balancing, but I wanted to reduce the HTTP calls inside my cluster by using this file input. If I can't scale it, though... I need to double-check what works better in my scenario.