Elasticsearch pending_tasks

Hello everyone:

The ES nodes are healthy and the cluster state is green, but the cluster is showing a large number of pending_tasks, and tasks sit in the queue for several minutes. How should this be handled? I have not found a solution.
The following is the content of pending_tasks:

insertOrder timeInQueue priority source
96568 1.3m URGENT create-index [brmem_bc_m_star_1], cause [api]
96569 1.1m URGENT create-index [sts_bc_m_star_4], cause [api]
96571 50.3s URGENT create-index [brmem_bc_m_star_1], cause [api]
96570 1m URGENT create-index [sts_bc_m_11], cause [api]
96575 37s URGENT create-index [sts_bc_m_4], cause [api]
96572 38.1s URGENT shard-started
96573 38.1s URGENT shard-started
96574 38s URGENT shard-started
96577 32.5s HIGH add_listener
96576 33.6s URGENT create-index [sts_bc_m_star_11], cause [api]
96578 20.2s URGENT create-index [brmem_bc__1], cause [api]
96579 7s URGENT create-index [sts_bc_4], cause [api]
96580 3.5s URGENT create-index [sts_bc_11], cause [api]

This suggests your master node is having trouble processing cluster state updates caused by the simultaneous creation of a bunch of indices. Is it working its way through these tasks at all? In other words, if you GET /_cluster/pending_tasks repeatedly do you see the list shrinking over time?
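To watch whether it is draining, the _cat equivalents can be rerun every few seconds (in Kibana Dev Tools or adapted to curl); this is just a monitoring sketch using standard APIs, not a fix:

# One row per queued task; rerun and compare the row count and the timeInQueue column.
GET /_cat/pending_tasks?v

# The pending_tasks column here gives just the queue length.
GET /_cat/health?v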

Some possible causes:

  • Too many shards in your cluster. Every time you create an index the master must allocate its shards somewhere, and it takes into account the locations of all the other shards when doing so. The more shards you have, the longer this calculation can take.

  • Overloaded master. Similar to the previous point, if the master is busy doing non-master things then it has fewer resources available to process cluster state updates quickly.

  • Failing/slow node. Cluster state updates happen in order, with each update waiting for all nodes to acknowledge it before moving on to the next one, so a failing node can slow everything down until it's removed from the cluster. The master logs should indicate if there's a node that is not acknowledging updates fast enough (see the sketch after this list for a few requests that help narrow these causes down).
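A sketch of those checks, using only standard APIs, to run against your own cluster:

# Cluster-wide totals: number of indices, primary and total shards.
GET /_cluster/stats?filter_path=indices.count,indices.shards

# Per-node CPU, load and heap; '*' in the master column marks the elected master.
GET /_cat/nodes?v&h=name,master,cpu,load_1m,heap.percent

# A snapshot of what each node is busy doing; check the elected master's section.
GET /_nodes/hot_threads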


Thank you very much for your answer.
The shard-started, index-aliases and delete-index tasks are also slow.
Is there any way to optimize this?

insertOrder timeInQueue priority source
104055 28.2s URGENT shard-started
104056 28.1s URGENT shard-started
104057 27.5s URGENT shard-started
104060 23.3s URGENT index-aliases
104058 27.4s URGENT shard-started
100002 28.9m HIGH add_listener
100041 28.4m HIGH add_listener
104061 22.5s URGENT delete-index

How many shards and indices do you have in the cluster? How many nodes?
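For reference, all three figures are reported by the cluster stats API, for example:

# Node, index and shard counts in a single filtered response.
GET /_cluster/stats?filter_path=nodes.count,indices.count,indices.shards.total,indices.shards.primaries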

Thank you. @Christian_Dahlqvist
nodes: 30
indices: 13519
shards: 82589
master nodes: 3
data nodes: 27

That is indeed a very large number of shards, and would explain why cluster updates are slow. I would recommend you try to reduce that quite dramatically. Please read this blog post for some guidance on target shard size and sharding practices.
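If reindexing everything is not practical, the shrink index API can reduce the primary shard count of existing indices. A rough sketch, where my_index, my_index_shrunk and data-node-1 are placeholders for one of your own indices and data nodes, and the target shard count must be a factor of the source's:

# Step 1: move a copy of every shard to a single node and block writes,
# which the shrink API requires.
PUT /my_index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "data-node-1",
    "index.blocks.write": true
  }
}

# Step 2: once relocation has finished, shrink into a new index with
# fewer primary shards.
POST /my_index/_shrink/my_index_shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}

After verifying the shrunken index, the original needs to be deleted (and optionally replaced by an alias) for the shard count to actually drop.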

Thank you very much. Has Elastic done any load testing of the Elasticsearch master node? How many shards can it support?

Yes, please read the blog post linked earlier.

The blog post I linked to provides guidelines for the maximum number of shards you should aim to have on a node, so do not use this as a target. If you can have fewer, that is usually better.

When it comes to how many shards a single cluster can handle, it often depends on the use case and how often the cluster state, which holds information about all indices and shards, needs to be updated. The larger the cluster state, the longer it takes to apply updates to indices, shards and mappings. Each update to the cluster state also needs to be distributed to all the nodes in the cluster, so the node count and how quickly those nodes respond can also play a part.

For clusters with relatively few updates, where updates may be allowed to take a bit longer, the number of shards that can be supported may therefore be considerably larger than for a use case with a large number of indices created at specific times or with frequently updated mappings. There is therefore no set limit to the number of shards a cluster can handle.
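As a side note, the number of shards created going forward can also be capped with an index template; a sketch assuming a 6.x-style legacy template, where the template name and the index patterns (based on the names in your pending_tasks output) are just examples to adjust:

PUT /_template/one_shard_per_index
{
  "index_patterns": ["sts_bc_*", "brmem_bc_*"],
  "order": 0,
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}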

Thank you very much.

Thank you for your detailed answers.
