Hello,
On one of our larger stacks we have recently seen (twice last week) a problem that appears to be related to the new 'desired balance allocator'. The documentation implies this is purely a background operation and shouldn't block other cluster tasks, but we can't think of any other culprit in the situations we've encountered.
We're running version 8.6.2 with a backport of #93461.
Another thing that may be relevant: we've set cluster.routing.allocation.balance.shard to 0 to prioritise index and disk balancing.
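For completeness, that was applied as a persistent cluster settings update along these lines (the same value could equally be set in elasticsearch.yml; shown here only to be explicit about what we changed):

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.balance.shard": 0.0
  }
}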
The sequence of events we saw was:
- Master node CPU gets pinned, with one core fully utilised, and stays that way until resolution
- During that time, ILM decides to do a rollover
- New index is created and the alias is rolled over, but the new shards remain unassigned
- Ingest is now broken
Hot threads on the master node show:
Hot threads at 2023-03-25T21:42:34.520Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
100.0% [cpu=100.0%, other=0.0%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[elasticsearch-0-es-master-0][generic][T#1]'
2/10 snapshots sharing following 16 elements
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.decider.AllocationDeciders.canRemain(AllocationDeciders.java:116)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.decideMove(BalancedShardsAllocator.java:876)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.moveShards(BalancedShardsAllocator.java:834)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator.allocate(BalancedShardsAllocator.java:183)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceComputer.compute(DesiredBalanceComputer.java:253)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.lambda$processInput$0(DesiredBalanceShardsAllocator.java:111)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1$$Lambda$7874/0x0000000802233660.run(Unknown Source)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator.recordTime(DesiredBalanceShardsAllocator.java:304)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:108)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:100)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.ContinuousComputation$Processor.doRun(ContinuousComputation.java:92)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:917)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base@19.0.2/java.lang.Thread.run(Thread.java:1589)
2/10 snapshots sharing following 17 elements
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$NodeSorter.reset(BalancedShardsAllocator.java:1419)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.buildWeightOrderedIndices(BalancedShardsAllocator.java:782)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.balanceByWeights(BalancedShardsAllocator.java:647)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.balance(BalancedShardsAllocator.java:506)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator.allocate(BalancedShardsAllocator.java:184)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceComputer.compute(DesiredBalanceComputer.java:253)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.lambda$processInput$0(DesiredBalanceShardsAllocator.java:111)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1$$Lambda$7874/0x0000000802233660.run(Unknown Source)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator.recordTime(DesiredBalanceShardsAllocator.java:304)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:108)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:100)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.ContinuousComputation$Processor.doRun(ContinuousComputation.java:92)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:917)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base@19.0.2/java.lang.Thread.run(Thread.java:1589)
4/10 snapshots sharing following 14 elements
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator$Balancer.balance(BalancedShardsAllocator.java:506)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator.allocate(BalancedShardsAllocator.java:184)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceComputer.compute(DesiredBalanceComputer.java:253)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.lambda$processInput$0(DesiredBalanceShardsAllocator.java:111)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1$$Lambda$7874/0x0000000802233660.run(Unknown Source)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator.recordTime(DesiredBalanceShardsAllocator.java:304)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:108)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:100)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.ContinuousComputation$Processor.doRun(ContinuousComputation.java:92)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:917)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base@19.0.2/java.lang.Thread.run(Thread.java:1589)
2/10 snapshots sharing following 13 elements
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.BalancedShardsAllocator.allocate(BalancedShardsAllocator.java:181)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceComputer.compute(DesiredBalanceComputer.java:253)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.lambda$processInput$0(DesiredBalanceShardsAllocator.java:111)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1$$Lambda$7874/0x0000000802233660.run(Unknown Source)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator.recordTime(DesiredBalanceShardsAllocator.java:304)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:108)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.DesiredBalanceShardsAllocator$1.processInput(DesiredBalanceShardsAllocator.java:100)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.cluster.routing.allocation.allocator.ContinuousComputation$Processor.doRun(ContinuousComputation.java:92)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:917)
app/org.elasticsearch.server@8.6.2-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base@19.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base@19.0.2/java.lang.Thread.run(Thread.java:1589)
The unassigned shards are reported as allowed to be allocated to the relevant nodes, but their allocation status is 'no attempt'. The master node has these two tasks seemingly stuck:
indices:admin/rollover 20:41:35 1h
indices:admin/settings/update 20:41:35 1h
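Those entries came from the task management API on the master; roughly this (the actions filter is only there to narrow the listing):

GET _tasks?detailed=true&actions=indices:admin/rollover,indices:admin/settings/update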
To resolve the situation, the first time we simply killed the master and let another master-eligible node take over. The second time, we updated the balance setting mentioned above from 0 to a small non-zero value, which presumably interrupted or restarted the desired balance computation and allowed the master to resume normal operations.
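Concretely, that second recovery was just another cluster settings update along these lines (the 0.05 is only an example of "some small non-zero value", not a recommendation):

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.balance.shard": 0.05
  }
}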
Is this a known bug? Could it be related to the balance settings we're using, or to something else entirely?