How to tune Logstash for ES indexing speed

I am using Logstash 6.2.x with Kafka as input and ES 6.4.0 (hosted in Elastic Cloud) as the output. I needed to stop Logstash for some time while my ES cluster was being upgraded. When I restarted Logstash, several hours of messages were backed up (let's call it 500,000 messages), and I noticed that ES was indexing roughly 3,000 documents/sec on the primary shards. The CPU on the Logstash server was around 30%. How do I tell what the limiting factor is here? How do I ensure that Logstash is tuned so that it is not the bottleneck?

The ES documentation provides some tips such as "use bulk requests" and "use multiple threads". It doesn't seem like Logstash's ES output plugin provides these options, though it does say "This plugin attempts to send batches of events as a single request." What are the details of this behavior? Is it customizable?

With my ES hosted at Elastic and without full access to metrics, how do I tell whether my ES cluster is the bottleneck in this scenario, and if so, which resource is the limit? Is it CPU bound (the charts under the 'Performance' page on my dashboard are blank at the moment) or I/O bound?


Logstash uses bulk requests and multiple threads by default, so unless you are running it on a node with very limited CPU, it is likely that Elasticsearch is the bottleneck. On Elastic Cloud you can enable monitoring, which will give you visibility into what might be limiting performance.
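To put a number on what Logstash itself is pushing out, one option is to sample the Logstash node stats API twice and compare the event counters. The sketch below is only an illustration: it assumes the two samples are already parsed JSON from `GET /_node/stats/pipelines` on the Logstash monitoring port, that the pipeline is named `main`, and that the counter lives at `pipelines.<name>.events.out`. The function names are hypothetical.

```python
def events_out(stats, pipeline="main"):
    """Read the cumulative 'out' event counter for one pipeline.

    Assumes the layout pipelines.<name>.events.out, as returned by the
    Logstash node stats API; adjust the path if your version differs.
    """
    return stats["pipelines"][pipeline]["events"]["out"]


def events_per_second(first, second, elapsed_seconds, pipeline="main"):
    """Average output rate between two stats samples taken elapsed_seconds apart."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = events_out(second, pipeline) - events_out(first, pipeline)
    return delta / elapsed_seconds
```

If the rate computed this way stays flat when you raise the number of pipeline workers or the batch size, that points away from Logstash and toward Elasticsearch as the limit.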

Are you getting 429 messages in your Logstash logs? That's ES telling Logstash to back off the upload rate. If Logstash is hovering around 30% CPU, I'm guessing ES is the limiting factor and is sending back 429s.
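A quick way to check is to count the log lines that mention a 429. This is a rough sketch, not an official tool: the pattern assumes the Elasticsearch output logs retries with wording like `response code: 429` or a `:code=>429` field, which may differ between plugin versions, so pass your own pattern if your logs look different. The function name is hypothetical.

```python
import re

# Assumed wording of the retry messages; override if your logs differ.
DEFAULT_429_PATTERN = r"(?:response code: |:code=>)429"


def count_429_lines(lines, pattern=DEFAULT_429_PATTERN):
    """Count log lines that look like a 429 (too many requests) retry."""
    matcher = re.compile(pattern)
    return sum(1 for line in lines if matcher.search(line))
```

For example, `count_429_lines(open(path))` with `path` pointing at your Logstash log file returns the number of matching lines; a steadily growing count during the backlog replay suggests ES is rejecting bulk requests.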

It sounds like you're already looking at the "Tune for indexing speed" document.

The most effective item I've found is increasing the refresh interval. We changed our refresh intervals from the default 1s to 15s on most indices, and our max EPS (events per second) went from ~12,000 to ~16,000.
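For reference, the refresh interval is a dynamic index setting, so it can be changed on a live index through the update index settings API (`PUT /<index>/_settings`). The sketch below only builds the HTTP request with the Python standard library and does not send it. The base URL and index name are placeholders, the function name is hypothetical, and authentication (which Elastic Cloud requires) is deliberately left out.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index, interval="15s"):
    """Build (but do not send) a PUT request that sets index.refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Sending it would be a matter of adding credentials and passing the request to `urllib.request.urlopen`. Setting the interval back to `"1s"` the same way restores the default once the backlog has been indexed.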

