I have run the query many times this way, and you're right: I see consistently fast results.
?preference=_only_local
Is there anything more I should try?
GET _nodes/hot_threads?ignore_idle_threads=false&threads=999999
With node-2 shut down, I got a large output in which every thread showed 0.0% (0s out of 500ms) CPU usage.
When node-2 is active, only a few threads show a low CPU load:
0.2% (1ms out of 500ms) cpu usage by thread 'elasticsearch[node-2][scheduler][T#1]'
10/10 snapshots sharing following 9 elements
java.base@12/jdk.internal.misc.Unsafe.park(Native Method)
java.base@12/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:235)
java.base@12/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
java.base@12/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1182)
java.base@12/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
java.base@12/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
java.base@12/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
java.base@12/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@12/java.lang.Thread.run(Thread.java:835)
0.1% (499.4micros out of 500ms) cpu usage by thread 'elasticsearch[node-2][transport_worker][T#6]'
10/10 snapshots sharing following 8 elements
java.base@12/sun.nio.ch.EPoll.wait(Native Method)
java.base@12/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
java.base@12/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
java.base@12/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:765)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:413)
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
java.base@12/java.lang.Thread.run(Thread.java:835)
0.1% (466.7micros out of 500ms) cpu usage by thread 'elasticsearch[node-2][transport_worker][T#2]'
10/10 snapshots sharing following 8 elements
java.base@12/sun.nio.ch.EPoll.wait(Native Method)
java.base@12/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
java.base@12/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
java.base@12/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:765)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:413)
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
java.base@12/java.lang.Thread.run(Thread.java:835)
0.1% (253.1micros out of 500ms) cpu usage by thread 'elasticsearch[node-2][refresh][T#1]'
10/10 snapshots sharing following 9 elements
java.base@12/jdk.internal.misc.Unsafe.park(Native Method)
java.base@12/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:235)
java.base@12/java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:740)
java.base@12/java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:684)
java.base@12/java.util.concurrent.LinkedTransferQueue.poll(LinkedTransferQueue.java:1374)
java.base@12/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1053)
java.base@12/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
java.base@12/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@12/java.lang.Thread.run(Thread.java:835)
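
With `threads=999999` the output gets long, so it may help to reduce it to one line per thread. Below is a minimal sketch in Python that assumes the header lines always look like the ones above (`N% (X out of Y) cpu usage by thread '...'`). The function name `summarize_hot_threads` and the `sample` text are only illustrative and not part of Elasticsearch.

```python
import re

# Matches header lines such as:
#   0.2% (1ms out of 500ms) cpu usage by thread 'elasticsearch[node-2][scheduler][T#1]'
# Assumption: the header format is the one seen in the output pasted above.
HEADER = re.compile(
    r"(?P<pct>\d+(?:\.\d+)?)% \((?P<busy>[^)]*?) out of (?P<interval>[^)]*?)\) "
    r"cpu usage by thread '(?P<thread>[^']+)'"
)


def summarize_hot_threads(text):
    """Return (percent, busy_time, thread_name) tuples, busiest thread first."""
    rows = []
    for line in text.splitlines():
        match = HEADER.search(line)
        if match:
            rows.append(
                (float(match.group("pct")), match.group("busy"), match.group("thread"))
            )
    # Stack frames and "snapshots sharing" lines are skipped; only headers are kept.
    return sorted(rows, key=lambda row: row[0], reverse=True)


# Shortened copy of the output above, used here as sample input.
sample = """\
0.2% (1ms out of 500ms) cpu usage by thread 'elasticsearch[node-2][scheduler][T#1]'
10/10 snapshots sharing following 9 elements
java.base@12/jdk.internal.misc.Unsafe.park(Native Method)
0.1% (499.4micros out of 500ms) cpu usage by thread 'elasticsearch[node-2][transport_worker][T#6]'
0.1% (253.1micros out of 500ms) cpu usage by thread 'elasticsearch[node-2][refresh][T#1]'
"""

for pct, busy, thread in summarize_hot_threads(sample):
    print(f"{pct:5.1f}%  {busy:>12}  {thread}")
```

Saving the response of the hot_threads request to a file and passing its contents to `summarize_hot_threads` should make it easier to compare the run with node-2 shut down against the run with node-2 active.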