Slow down Logstash output to Elasticsearch

natharran · May 28, 2025, 8:54am

Hello all.

This might seem a bit strange, but I need to slow down the Logstash output rate to Elasticsearch. Logstash reads data from a Kafka topic and, after processing, sends it to Elasticsearch. That's all well and good as long as there is no large backlog of messages waiting.

However, if many messages are waiting in the Kafka topic (e.g. because Logstash was down for some reason), Logstash pushes them to Elasticsearch so fast that our OpenShift starts emitting errors about "High memory pressure" and page faults. After a few minutes, ES stops receiving new documents until it recovers, and then the process repeats itself.

I've increased ES memory (64 GB) and JVM heap (32 GB) per node. There are 4 ES data nodes on 4 separate machines (OpenShift workers), and the index has 2 primary shards with 1 replica each. I lowered the LS batch size from 2000 to 512. I even applied the LS throttle filter to discard messages over the limit per timeframe (which works, but doesn't seem to help).

I'd appreciate any help.

Rios · May 28, 2025, 1:19pm

Maybe you can set the batch size and increase the batch delay.

Could the sleep filter plugin help? Combined with automatic config reloading, it could help in some cases.

leandrojmp · May 28, 2025, 1:25pm

32 GB of heap may be problematic, as it is close to the threshold for compressed oops. I would suggest checking whether you have crossed this limit, as mentioned in the documentation, and reducing the heap to about 30 GB at most.
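
One rough way to check the compressed oops point on every node: the nodes info API (GET _nodes/jvm) should report a per-node field called using_compressed_ordinary_object_pointers. The sketch below assumes that response shape and runs against a made-up sample, not output from the cluster in this thread; the function name and node names are hypothetical.

```python
def nodes_without_compressed_oops(nodes_info: dict) -> list[str]:
    """Return the names of nodes whose JVM reports compressed oops as disabled.

    `nodes_info` is assumed to be the parsed JSON body of GET _nodes/jvm.
    """
    flagged = []
    for node_id, node in nodes_info.get("nodes", {}).items():
        jvm = node.get("jvm", {})
        # The flag is assumed to arrive as the string "true" or "false".
        # A missing value is treated as unknown and is not flagged.
        value = str(jvm.get("using_compressed_ordinary_object_pointers", "")).lower()
        if value == "false":
            flagged.append(node.get("name", node_id))
    return sorted(flagged)


# Hypothetical sample response, trimmed to the fields used above.
sample = {
    "nodes": {
        "a1": {"name": "es-data-1",
               "jvm": {"using_compressed_ordinary_object_pointers": "true"}},
        "b2": {"name": "es-data-2",
               "jvm": {"using_compressed_ordinary_object_pointers": "false"}},
    }
}

print(nodes_without_compressed_oops(sample))  # ['es-data-2']
```

If any node shows up in that list, its heap is above the threshold and lowering it should bring compressed oops back.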

What is the disk type of your nodes? This has a huge influence on indexing speed.

Also, have you changed the index.refresh_interval for the index? The default is 1s, which in my experience can be a performance killer; in my clusters I do not use a refresh_interval smaller than 15s.
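
For reference, the refresh interval is a dynamic index setting that can be changed with a PUT to the index's _settings endpoint. Here is a minimal sketch that only builds the request, so it can be inspected before anything is sent. The base URL, the index name "my-index" and the 30s value are placeholders, not values from this thread.

```python
import json
import urllib.request


def refresh_interval_request(base_url: str, index: str, interval: str) -> urllib.request.Request:
    """Build (but do not send) a PUT <index>/_settings request for refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder cluster address and index name.
req = refresh_interval_request("http://localhost:9200", "my-index", "30s")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually apply it, something like urllib.request.urlopen(req) would be
# needed, plus whatever authentication and TLS setup the cluster requires.
```

If the indices are created from an index template, the same setting would probably belong in the template too, so that new indices pick it up.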

How many workers is the pipeline configured to use? If you did not explicitly configure it, Logstash will use the number of CPUs of the host. Maybe changing the batch size to the default of 125 and reducing the number of workers could help.
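
To put numbers on the batch size suggestion: the maximum number of events a pipeline holds in flight is pipeline.workers multiplied by pipeline.batch.size. The small sketch below compares the batch sizes mentioned in this thread, assuming 2 workers (the figure given later in the thread). The function name is just for illustration.

```python
def max_inflight_events(workers: int, batch_size: int) -> int:
    """Upper bound on events held in memory by one pipeline's workers."""
    return workers * batch_size


WORKERS = 2  # assumed worker count

for label, batch_size in [("original", 2000), ("current", 512), ("default", 125)]:
    print(f"{label}: {max_inflight_events(WORKERS, batch_size)} events in flight")
# original: 4000 events in flight
# current: 1024 events in flight
# default: 250 events in flight
```

Keep in mind this bounds how much each worker holds and sends per bulk request. It does not cap the overall indexing rate by itself.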

You may also try configuring the Kafka input plugin to reduce the number of records it pulls, for example using max_poll_records.

natharran · May 28, 2025, 1:45pm

Hello @leandrojmp and thank you so much for the tips. The disks are high-write NVMe drives, so I'm not expecting a problem there. I will lower the heap size and increase the index refresh interval, and will let you know whether it helps.

As for Logstash, the pipeline is set to 2 workers.

@Rios Thank you, I tested those but without success.

Thank you again.
