Indexing throttling in Elasticsearch

Hello all,

We have a number of periodic indexing tasks that bulk-index a few (let's say 10) million documents daily, partitioned into pages of 5K documents (we have 200 tasks, each handling a page of 5K documents).

When multiple pages (more than 80) are pushed to Elasticsearch for indexing simultaneously, the Elasticsearch index queue gets flooded and search and indexing performance degrades cluster-wide. Increasing the index queue_size provides some improvement, but it seems like a band-aid.
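For reference, here is roughly how we watch the queues fill up, a sketch using the cat thread-pool API (the column names below are from the 2.x API and may differ in later versions):

```
curl 'localhost:9200/_cat/thread_pool?v&h=host,index.active,index.queue,index.rejected,bulk.queue,bulk.rejected'
```

The rejected counters climbing per node is how we know the queue, not disk or CPU, is the bottleneck.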

Unfortunately, our task queue does not allow us to throttle the indexing tasks on the application side.

So, is there a way in Elasticsearch itself to throttle indexing tasks (without dropping the index requests, of course)?
Latency is not a big problem in our case; indexing taking longer is acceptable.

We have around 10 nodes in the cluster (Elasticsearch v2), each acting as a master-eligible data node.

Any ideas or suggestions?
Best

Queues are useful for handling variable load. When they fill up, the subsequent rejections are a form of backpressure that clients should use to throttle themselves. If for some reason throttling is not acceptable, it means the cluster is underprovisioned for the load and needs additional capacity.
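To make that concrete, here is a minimal sketch (not an official recipe) of a client honoring that backpressure with the Python client: items the cluster rejects with HTTP 429 (es_rejected_execution_exception) are resubmitted after an exponential backoff. The host, index name, document type, and helper name are illustrative; recent versions of the bulk helpers also offer built-in retry parameters that do much the same thing.

```python
# A minimal sketch of backpressure-aware bulk indexing with
# elasticsearch-py. Host, index name, and "_type" are illustrative
# (types were still required in the 2.x era).
import time
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["localhost:9200"])

def index_page(docs, index, max_retries=5):
    """Index one page, resubmitting queue-rejected items with backoff."""
    actions = [{"_index": index, "_type": "doc", "_source": d} for d in docs]
    for attempt in range(max_retries):
        rejected = []
        # streaming_bulk yields one (ok, item) result per action, in order.
        results = helpers.streaming_bulk(es, actions, chunk_size=500,
                                         raise_on_error=False)
        for action, (ok, item) in zip(actions, results):
            status = list(item.values())[0].get("status")
            if not ok and status == 429:  # queue full: the backpressure signal
                rejected.append(action)
        if not rejected:
            return
        actions = rejected
        time.sleep(min(2 ** attempt, 60))  # exponential backoff, capped
    raise RuntimeError("%d items still rejected after retries" % len(actions))
```

With something like this in each task, the rejections become the throttle instead of a source of lost documents.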

Let’s think through what it would mean, though, for Elasticsearch to throttle itself without the clients backing off. That means Elasticsearch has to buffer all these requests. Eventually it will run out of capacity to do that and will have to start rejecting requests; all we have done is move the backpressure problem. You might say: hold on, it won’t run out of capacity if I put a giant disk behind it and let Elasticsearch spill the requests to disk. But then we have just reinvented a persistent queue, and we already have a solution for that: Logstash.
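If you do go that route, here is a minimal sketch of the idea, assuming a Logstash version with persistent queues (5.x or later); the port, index name, and queue size below are illustrative:

```
# logstash.yml -- back the in-flight queue with disk
queue.type: persisted
queue.max_bytes: 8gb   # illustrative cap on the on-disk buffer
```

```
# pipeline.conf -- accept events over HTTP and bulk-index them
input {
  http { port => 8080 }   # illustrative ingest endpoint
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "docs"       # illustrative index name
  }
}
```

Logstash absorbs the burst on disk and feeds Elasticsearch at a rate it can sustain; its elasticsearch output retries rejected bulk items on its own.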

So here’s where I am on this: if you’re overwhelming Elasticsearch, you either need to make your cluster bigger or apply throttling on the client side.

Logstash seems to be the answer to my question. Thanks! :+1:

You’re welcome!
