Kubernetes Logstash StatefulSet behind a Service of type LoadBalancer (Amazon EKS): uneven distribution of events between pods

I am working with a self-managed Elasticsearch cluster hosted on Amazon EKS.
The pipeline flow is:

  1. A Filebeat agent is deployed on all EC2 servers, sending data to Logstash at https://logstash.company.com:5046 (a minimal sketch of the assumed output config follows this list).
  2. Logstash is deployed as a StatefulSet, with a Service of type LoadBalancer pointing to the pods of the StatefulSet. A Route53 record maps logstash.company.com to the load balancer's DNS name.
  3. The Logstash config uses centralized pipeline management in Kibana/Elasticsearch.
  4. I see a huge difference in the events-received count per pod: some show around 500m and others around 50 million over the same period since they started.
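
For context, a minimal sketch of the Filebeat output configuration assumed on the EC2 servers; the host name and port come from the flow above, everything else is illustrative:

# filebeat.yml (sketch)
output.logstash:
  # Beats protocol over TCP to the Route53 name in front of the load balancer
  hosts: ["logstash.company.com:5046"]
  # if the endpoint is TLS-terminated at Logstash, the ssl.* options go here as well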

Sample screenshots of the per-pod event counts were attached as images.

Service definition:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: 10.0.0.0/8
    service.beta.kubernetes.io/aws-load-balancer-type: elb
  labels:
    app: lg-endpoint-eks-filebeat-1
  name: lg-endpoint-eks-filebeat
  namespace: elasticsearch
spec:
  ports:
  - name: http-d
    nodePort: 31415
    port: 5046
    protocol: TCP
    targetPort: 5046
  - name: metrics
    nodePort: 30315
    port: 80
    protocol: TCP
    targetPort: 9600
  selector:
    app: lg-endpoint-eks-filebeat-1
  type: LoadBalancer

I am not able to figure out why it is this way.
Also, this service definition creates an Amazon Classic Load Balancer.
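
For reference, the load balancer type is controlled by the aws-load-balancer-type annotation; with the legacy in-tree AWS cloud provider (which the Classic ELB result suggests is in use here), any value other than nlb, including elb as above, falls back to a Classic ELB. A sketch of the annotation that would provision a Network Load Balancer instead is below, though an NLB also balances per connection/flow, so on its own it would not change this behaviour:

metadata:
  annotations:
    # "nlb" provisions a Network Load Balancer; other values result in a Classic ELB
    service.beta.kubernetes.io/aws-load-balancer-type: nlb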

Does the load balancer have some stickiness? Or does Filebeat have anything to do with this behaviour?

The load balancer balances connection attempts. If you have multiple clients, they will often get distributed across multiple servers. If you have a single Filebeat, it will establish a connection to one of the servers and keep reusing it, so basically all the events go to one server until the client restarts.

I have hundreds of agents connecting to Logstash. Is there any way I can distribute the load?
Would adopting an Ingress instead of a LoadBalancer Service help?
Or is there any other approach?

I found that we should be using the ttl setting in Filebeat's Logstash output.
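
For anyone finding this later, a hedged sketch of what that looks like in filebeat.yml; the 5m value is only an example, and per the Filebeat docs the ttl option is not honoured while the async client is in use, so pipelining has to be set to 0:

output.logstash:
  hosts: ["logstash.company.com:5046"]
  # periodically drop and re-establish the connection so the load balancer
  # can assign it to a different Logstash pod
  ttl: 5m
  # ttl is ignored by the async client, so pipelining must be disabled
  pipelining: 0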
