Customize filebeat connections

Hello,

We have been given a Logstash endpoint that fronts 2 replicas running on Kubernetes. We have no control over which Logstash pod we connect to, and our clients often end up on the same pod, which halves the throughput, because the sessions that Filebeat opens stay open until there is nothing left to send.

Does Filebeat have any configuration option to make the connection to Logstash close and reopen every X seconds, every X events, or every X bytes?

I have also tried to increase the number of connections from each client by adding "worker: 2" to the Logstash output, but that does not work either.

Our filebeat config:

filebeat.inputs:
- type: filestream
  id: XXXXXXXXXXXXXX
  enabled: true
  harvester_limit: 4000
  paths:
    - XXXXXXX*.csv
  prospector.scanner.exclude_files: ['\.gz$']
  prospector.scanner.check_interval: 60s
  close.reader.on_eof: true
  ignore_older: 48h
  clean_inactive: 49h

output.logstash:
  hosts: ["XXXXXXX:XXXX"]
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  worker: 2

Greetings and thanks in advance

What are you putting in the hosts? The two replicas or an entry point in your k8s infra?

An entry point. It is a load balancer VIP that balances between all the workers.

This is the other thing I am testing: setting the FQDNs of the 2 pods as hosts:
my-pod-0.my-svc.my-namespace.svc.cluster.local
my-pod-1.my-svc.my-namespace.svc.cluster.local

And these FQDNs resolve to the load balancer VIP.

But I would also like to know whether, at the Filebeat level, I can set the number of connections and limit how long these connections stay open.

Regards

I read about some issues in the past where it was necessary to point directly to the pods running Logstash, not an entry point, to achieve better load balancing, but I could not find the post about it.

If I'm not wrong, you limit the number of connections with the worker setting; since you have worker: 2, it will have two workers connecting to your Logstash endpoint.

To limit the time you may need to use the ttl setting, since you are behind a load balancer.

Connections from beats to logstash are sticky, so if you have a load balancer in front of logstash, this can lead to uneven balancing.

You could try the ttl setting (you will also need to disable pipelining, as explained in the documentation).

Many thanks,

As you said, I had to set "pipelining: 0", but also "loadbalance: true". The reference guide says loadbalance is true by default, but that does not seem to be the case: I had to set "loadbalance: true" explicitly for it to work. With this config:

output.logstash:
  hosts: ["XXX.XXX.XXX.XXX:XXX"]
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  worker: 2
  pipelining: 0
  loadbalance: true
  ttl: 1m

Filebeat opens 2 sessions and renews them every minute, and I get a better balance between the 2 pods.

I have yet to test whether I can specify the pod to connect to through its FQDN. If I manage to do so, I will share it here.

Regards

This is true if you have more than one host defined in the hosts setting; if you have just one, the load balancing is not done by Filebeat but by your ingress.
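
For reference, a minimal sketch of what a multi-host output could look like if each Logstash pod were reachable directly. The hostnames are the placeholder pod FQDNs mentioned earlier in this thread, and port 5044 is an assumption; neither is a confirmed value from this environment.

```yaml
# Sketch only: assumes each pod FQDN resolves to its own pod,
# not to the load balancer VIP. Hostnames and port are placeholders.
output.logstash:
  hosts:
    - "my-pod-0.my-svc.my-namespace.svc.cluster.local:5044"
    - "my-pod-1.my-svc.my-namespace.svc.cluster.local:5044"
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  # With several hosts and loadbalance enabled, Filebeat itself
  # distributes events across the listed hosts.
  loadbalance: true
  # Number of workers per configured host.
  worker: 1
```

With a single VIP in hosts, as in the original config, this distribution step falls to whatever sits in front of Logstash.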

Just setting pipelining: 0 didn't work?

With one host, as I have, pipelining: 0 alone didn't work. After adding "loadbalance: true", it started to work as expected.

In the end I was not able to connect to a specific pod, I assume because my service is not headless. But I managed to control the session balancing with this setting in the k8s service:
externalTrafficPolicy: Local

With this setting you can control from the load balancer how sessions are distributed to each worker/pod. Specifically, I have configured least_connections balancing so that all pods have the same number of sessions.
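
In case it helps someone, a rough sketch of where that field sits in a Service manifest. The service name, selector label and port below are hypothetical placeholders, not the real values from my cluster, and the least_connections policy is configured on the external load balancer itself, not in this manifest.

```yaml
# Sketch only: name, selector and port are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: my-svc
  namespace: my-namespace
spec:
  type: LoadBalancer
  # Route external traffic only to pods on the node that received it,
  # so the external load balancer's choice of node determines the pod.
  externalTrafficPolicy: Local
  selector:
    app: logstash
  ports:
    - name: beats
      port: 5044
      targetPort: 5044
      protocol: TCP
```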

Thanks for your help; with these configurations I have managed to get a balanced flow.

Regards

