Logstash is not load-balanced properly

Hi,

We are using Logstash version 7.6.1, deployed from the Helm chart: https://github.com/elastic/helm-charts/tree/master/logstash.

resources:
  requests:
    cpu: "2"
    memory: "6Gi"
  limits:
    cpu: "2"
    memory: "6Gi"
logstashJavaOpts: "-Xmx3g -Xms3g"
logstashPersistenceStorage: 6Gi
logstashReplicaCount: 2

logstash config:

  logstash.yml: |
      http.host: 0.0.0.0
      config.reload.automatic: "true"
      queue.type: persisted
      queue.checkpoint.acks: 0
      queue.checkpoint.writes: 0
      queue.checkpoint.interval: 0
      queue.drain: "true"
      queue.max_bytes: 3gb  # disk capacity must be greater than the value of `queue.max_bytes`
      pipeline.workers: 4
      pipeline.batch.size: 5000
      #pipeline.batch.delay: 50
      # X-Pack
      http.host: "0.0.0.0"
      xpack.monitoring.enabled: true

The Kubernetes Logstash service sits behind an AWS internal ELB. For some reason, all requests to Logstash go to only one pod while the second pod stays mostly idle.
Here's the ELB configuration:

service:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: logstash.{{ .Environment.Values.dnsZoneName }}
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-extra-security-groups: {{ .Environment.Values.logstashSecurityGroup }}
  type: LoadBalancer
  ports:
    - name: beats
      port: 5044
      protocol: TCP
      targetPort: 5044
    - name: http
      port: 8080
      protocol: TCP
      targetPort: 8080

This happens in all environments.
Here is a screenshot of CPU usage:


As you can see, the blue line is the logstash-0 pod and the purple line is logstash-1; at any given point in time only ONE POD IS USED. This doesn't happen with any other service; we use the same networking for Kibana and there are no load-balancing issues there.

Event rate screenshots:

logstash-0:

logstash-1:

One more question:
We give 6 GB of RAM to the Logstash pods. How much of that can be allotted to heap space? Right now we set the heap to 3 GB. Is that appropriate, or do we need to adjust it?
Based on what values should we adjust the heap size?

Here's a heap usage graph for the pod that is used most of the time:

Any update here, please?

Thank you

Let's address the first problem.

If I understand correctly, two Logstash instances receive events from Filebeat or Metricbeat, passing through an Amazon load balancer.

Can you please share the filebeat.yml file?
In particular, the Logstash output section.

In the docs we suggest using a ttl to force Filebeat to reconnect, since Filebeat uses persistent TCP connections.

In particular, you can try to add:

pipelining: 0
ttl: 1m
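
For reference, here is a minimal sketch of what the output.logstash section of filebeat.yml could look like with these options (the hostname is only a placeholder for your ELB DNS name):

output.logstash:
  # placeholder hostname, use your ELB / external-dns name here
  hosts: ["logstash.example.internal:5044"]
  # ttl is only honoured when pipelining is disabled
  pipelining: 0
  # drop and re-open the connection every minute so the ELB can spread it again
  ttl: 1m

With a single ELB hostname in hosts, the ttl is what forces periodic reconnections and lets the load balancer redistribute them.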

Disabling pipelining and enabling the TTL solved our problem.

Let's move on to the second question:

We give 6 GB of RAM to the Logstash pods. How much of that can be allotted to heap space? Right now we set the heap to 3 GB. Is that appropriate, or do we need to adjust it?
Based on what values should we adjust the heap size?
Thank you

The JVM heap has to be sized depending on:

  • the number of parallel pipelines
  • the batch size for each pipeline
  • the number of workers per pipeline
  • the average size of the events being processed (see the worked example after this list)
  • the filters you're using (e.g. some filters cache values in memory)
  • the number of connections open (both input & output connections)
  • many other variables
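
For example, with the settings above (pipeline.workers: 4 and pipeline.batch.size: 5000), a single pipeline can hold 4 × 5000 = 20,000 events in flight at once; assuming an average event size of around 2 KB (purely an assumption, check your own data), that is roughly 40 MB of live event data per pipeline before the filters, the persisted queue, and the outputs add their own overhead. This in-flight memory grows linearly with both the number of workers and the batch size.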

To conclude, I would keep Stack Monitoring enabled, as you've already done, and check the frequency and duration of the JVM garbage collections.
Also make sure you're using the jvm.options file shipped with Logstash (apart from the JVM heap size).
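
If you want to double-check this outside of Stack Monitoring, the Logstash node stats API exposes heap usage and GC counters on each pod (9600 is the default API port); a quick way to look at them is something like:

curl -s 'http://localhost:9600/_node/stats/jvm?pretty'

Frequent or long old-generation collections there, or a heap that stays pinned near the 3 GB limit, would be the signal to raise the heap or reduce the batch size.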

