I am using Logstash 6.2.x with Kafka as the input and Elasticsearch 6.4.0 (hosted in Elastic Cloud) as the output. I needed to stop Logstash for some time while my ES cluster was being upgraded, so several hours of messages backed up in Kafka; let's call it 500,000 messages. Once I restarted Logstash, I noticed that ES was indexing only roughly 3,000 documents/sec on the primary shards, and CPU on the Logstash server was around 30%. How do I tell what the limiting factor is here? How do I ensure that Logstash is tuned so that it is not the bottleneck?
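In case it helps narrow things down, this is roughly how I've been trying to inspect the Logstash side, assuming I'm reading the node stats API correctly (the host and port below are just the defaults on my box):

```sh
# Logstash monitoring API (default port 9600): pipeline event counts and
# per-plugin timings, to see whether time is spent in the Kafka input,
# the filters, or the elasticsearch output.
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'

# JVM / process stats, to rule out heap pressure or CPU saturation on the
# Logstash side.
curl -s 'http://localhost:9600/_node/stats/jvm?pretty'
```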
The ES documentation provides tips such as "use bulk requests" and "use multiple threads". It doesn't seem like Logstash's ES output plugin exposes these as options, though its documentation does say "This plugin attempts to send batches of events as a single request." What are the details of this behavior? Is it customizable?
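From what I can tell, batching is controlled at the pipeline level rather than in the output plugin itself. Is adjusting something like the following in logstash.yml the right lever? The specific numbers here are just guesses on my part:

```yaml
# logstash.yml - pipeline-level settings that (as I understand it) determine
# the size of the bulk requests the elasticsearch output sends.
pipeline.workers: 4        # worker threads; defaults to the number of CPU cores
pipeline.batch.size: 1000  # events per worker per batch; the default is 125
pipeline.batch.delay: 50   # ms to wait for a batch to fill before flushing
```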
Given that my ES cluster is hosted in Elastic Cloud and I don't have full access to host-level metrics, how do I tell whether the cluster itself is the bottleneck in this scenario, and if so, what kind of bottleneck it is? Is it CPU bound or I/O bound? (The charts on the 'Performance' page of my Cloud dashboard are blank at the moment.)
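I do have access to the cluster's REST API, so I'm assuming I can at least check for indexing pressure there. Is something like the following the right place to look? (The cluster URL is just a placeholder for my Cloud endpoint, and my understanding is that in 6.x the bulk thread pool is called "write".)

```sh
# Rejections or a consistently full queue on the write thread pool would
# suggest ES can't keep up with the bulk requests it receives.
curl -s -u elastic 'https://<my-cluster-endpoint>/_cat/thread_pool/write?v&h=node_name,active,queue,rejected'

# Node-level indexing stats plus OS/JVM stats (CPU, heap, GC).
curl -s -u elastic 'https://<my-cluster-endpoint>/_nodes/stats/indices,os,jvm?pretty'

# Snapshot of what the nodes are actually busy doing right now.
curl -s -u elastic 'https://<my-cluster-endpoint>/_nodes/hot_threads'
```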
Thanks