Shards Taking a Long Time to Move Between Nodes - Cloud [7.1.1]

Hmm. That looks completely sequential. Is there any evidence that these chunks are being sent concurrently? I'd expect sometimes to see two received response messages in a row, without an intervening sent to, for instance.
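(In case anyone reading along wants to reproduce these logs: they come from the transport tracer. The logger name below is expanded from the o.e.t.T.tracer abbreviation in the log lines, and the transport.tracer.include pattern is my assumption for narrowing the output to recovery file chunks, so adjust as needed:

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.transport.TransportService.tracer": "TRACE",
    "transport.tracer.include": "internal:index/shard/recovery/*"
  }
}

Setting the logger back to null switches it off again.)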

Looking harder at the code, I see that if the sending node is the bottleneck then it'll fall back to sending chunks sequentially, and the timings here suggest that the sending node is indeed the bottleneck. It also looks like different JDKs have quite significant differences in the performance of their TLS implementations, so it'd be good to know what JDK you're using and whether an upgrade improves things for you.
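For reference, the nodes info API reports which JVM each node is running; something along these lines (filter_path is only there to trim the response) should show it:

GET /_nodes/jvm?filter_path=nodes.*.name,nodes.*.jvm.version,nodes.*.jvm.vm_vendor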

We are using the one that is installed with the Debian package.

https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html

That is Java version 12.0.1.

Release file:

IMPLEMENTOR="Oracle Corporation"
JAVA_VERSION="12.0.1"
JAVA_VERSION_DATE="2019-04-16"
MODULES="java.base java.compiler java.datatransfer java.xml java.prefs java.desktop java.instrument java.logging java.management java.security.sasl java.naming java.rmi java.management.rmi java.net.http java.scripting java.security.jgss java.transaction.xa java.sql java.sql.rowset java.xml.crypto java.se java.smartcardio jdk.accessibility jdk.internal.vm.ci jdk.management jdk.unsupported jdk.internal.vm.compiler jdk.aot jdk.internal.jvmstat jdk.attach jdk.charsets jdk.compiler jdk.crypto.ec jdk.crypto.cryptoki jdk.dynalink jdk.internal.ed jdk.editpad jdk.hotspot.agent jdk.httpserver jdk.internal.le jdk.internal.opt jdk.internal.vm.compiler.management jdk.jartool jdk.javadoc jdk.jcmd jdk.management.agent jdk.jconsole jdk.jdeps jdk.jdwp.agent jdk.jdi jdk.jfr jdk.jlink jdk.jshell jdk.jsobject jdk.jstatd jdk.localedata jdk.management.jfr jdk.naming.dns jdk.naming.rmi jdk.net jdk.pack jdk.rmic jdk.scripting.nashorn jdk.scripting.nashorn.shell jdk.sctp jdk.security.auth jdk.security.jgss jdk.unsupported.desktop jdk.xml.dom jdk.zipfs"
OS_ARCH="x86_64"
OS_NAME="Linux"
SOURCE=".:hg:e831fc6bca9e"

I could only see a couple of cases where there were two in a row, and I think that was only due to a delay where it took longer than usual to get the response.

Do you know what JDK was used?

It was some flavour of JDK11, I'm told, so yours is newer.

Following the idea that there's something slow about the source node, it'd be good to look at its hot threads output. Does it look busy doing things related to recoveries?
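For example, with warm-0 standing in for whatever the source node is called:

GET /_nodes/warm-0/hot_threads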

I have tried installing JDK 11 LTS.

Sending Node:

[2019-06-25T13:17:27,560][TRACE][o.e.t.T.tracer           ] [warm-1] [10818][internal:index/shard/recovery/file_chunk] sent response
[2019-06-25T13:17:27,563][TRACE][o.e.t.T.tracer           ] [warm-1] [26957][internal:index/shard/recovery/file_chunk] received request
[2019-06-25T13:17:27,564][TRACE][o.e.t.T.tracer           ] [warm-1] [26957][internal:index/shard/recovery/file_chunk] sent response
[2019-06-25T13:17:27,571][TRACE][o.e.t.T.tracer           ] [warm-1] [10819][internal:index/shard/recovery/file_chunk] received request
[2019-06-25T13:17:27,572][TRACE][o.e.t.T.tracer           ] [warm-1] [10819][internal:index/shard/recovery/file_chunk] sent response
[2019-06-25T13:17:27,573][TRACE][o.e.t.T.tracer           ] [warm-1] [26958][internal:index/shard/recovery/file_chunk] received request
[2019-06-25T13:17:27,574][TRACE][o.e.t.T.tracer           ] [warm-1] [26958][internal:index/shard/recovery/file_chunk] sent response
[2019-06-25T13:17:27,581][TRACE][o.e.t.T.tracer           ] [warm-1] [26959][internal:index/shard/recovery/file_chunk] received request
[2019-06-25T13:17:27,582][TRACE][o.e.t.T.tracer           ] [warm-1] [10820][internal:index/shard/recovery/file_chunk] received request
[2019-06-25T13:17:27,582][TRACE][o.e.t.T.tracer           ] [warm-1] [26959][internal:index/shard/recovery/file_chunk] sent response
[2019-06-25T13:17:27,582][TRACE][o.e.t.T.tracer           ] [warm-1] [10820]

Receiving Node:

[2019-06-25T13:17:27,544][TRACE][o.e.t.T.tracer           ] [warm-0] [26955][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,547][TRACE][o.e.t.T.tracer           ] [warm-0] [26955][internal:index/shard/recovery/file_chunk] received response from [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}]
[2019-06-25T13:17:27,553][TRACE][o.e.t.T.tracer           ] [warm-0] [26956][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,556][TRACE][o.e.t.T.tracer           ] [warm-0] [26956][internal:index/shard/recovery/file_chunk] received response from [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}]
[2019-06-25T13:17:27,562][TRACE][o.e.t.T.tracer           ] [warm-0] [26957][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,565][TRACE][o.e.t.T.tracer           ] [warm-0] [26957][internal:index/shard/recovery/file_chunk] received response from [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}]
[2019-06-25T13:17:27,572][TRACE][o.e.t.T.tracer           ] [warm-0] [26958][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,575][TRACE][o.e.t.T.tracer           ] [warm-0] [26958][internal:index/shard/recovery/file_chunk] received response from [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}]
[2019-06-25T13:17:27,580][TRACE][o.e.t.T.tracer           ] [warm-0] [26959][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,584][TRACE][o.e.t.T.tracer           ] [warm-0] [26960][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,585][TRACE][o.e.t.T.tracer           ] [warm-0] [26961][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,587][TRACE][o.e.t.T.tracer           ] [warm-0] [26962][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,587][TRACE][o.e.t.T.tracer           ] [warm-0] [26963][internal:index/shard/recovery/file_chunk] sent to [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}] (timeout: [15m])
[2019-06-25T13:17:27,588][TRACE][o.e.t.T.tracer           ] [warm-0] [26959][internal:index/shard/recovery/file_chunk] received response from [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}]
[2019-06-25T13:17:27,588][TRACE][o.e.t.T.tracer           ] [warm-0] [26961][internal:index/shard/recovery/file_chunk] received response from [{warm-1}{xyz}{abc}{***.***.***.***}{***.***.***.***:9300}{ml.machine_memory=16794214400, update_domain=11, ml.max_open_jobs=20, fault_domain=1, xpack.installed=true, box_type=warm}]
[2019-06-25T13:17:27,590][TRACE][o.e.t.T.tracer           ] [warm-0] [26964]

I'm not sure what the [269**] IDs in these logs are, but the sending node was sending [108**] as well as [269**] messages. However, the receiving node was not getting any of the [108**] records.

I will look into the source node hot threads now.

It's the other way around. Node warm-1 is the receiving node for what looks like 2 recoveries from 2 different nodes. One of the senders is warm-0 using message IDs in the 269xx range, and the other sender is using message IDs in the 108xx range. We don't log the sender on the receiving end (although I'm not sure why).

The most utilised thread in the hot threads output is:

97.1% (485.6ms out of 500ms) cpu usage by thread 'elasticsearch[warm-0][[index][0]: Lucene Merge Thread #4]'
 4/10 snapshots sharing following 9 elements
   app//org.apache.lucene.util.bkd.BKDWriter.merge(BKDWriter.java:570)
   app//org.apache.lucene.codecs.lucene60.Lucene60PointsWriter.merge(Lucene60PointsWriter.java:219)
   app//org.apache.lucene.index.SegmentMerger.mergePoints(SegmentMerger.java:201)
   app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:161)
   app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4459)
   app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4054)
   app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
   app//org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:101)
   app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)
 6/10 snapshots sharing following 8 elements
   app//org.apache.lucene.codecs.lucene60.Lucene60PointsWriter.merge(Lucene60PointsWriter.java:219)
   app//org.apache.lucene.index.SegmentMerger.mergePoints(SegmentMerger.java:201)
   app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:161)
   app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4459)
   app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4054)
   app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
   app//org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:101)
   app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)

Is that the correct thing for it to be doing?

That looks like a reasonable thing for a node to be doing, using a whole CPU, but it's nothing to do with the recovery of a shard. Are there any threads mentioning methods in the org.elasticsearch.indices.recovery package? Are the transport_worker threads up to anything?

I believe I ran the wrong command last time.

I re-ran it and got these as the top 3 results on the sending node:

9.4% (47ms out of 500ms) cpu usage by thread 'elasticsearch[warm-2][transport_worker][T#12]'
 9/10 snapshots sharing following 2 elements
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)

9.3% (46.3ms out of 500ms) cpu usage by thread 'elasticsearch[warm-2][transport_worker][T#13]'
 10/10 snapshots sharing following 2 elements
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)

3.7% (18.3ms out of 500ms) cpu usage by thread 'elasticsearch[warm-2][generic][T#5]'
 10/10 snapshots sharing following 44 elements
   java.base@11.0.3/sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
   java.base@11.0.3/sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:54)
   java.base@11.0.3/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:274)
   java.base@11.0.3/sun.nio.ch.IOUtil.read(IOUtil.java:245)
   java.base@11.0.3/sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:811)
   java.base@11.0.3/sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:796)
   app//org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:179)
   app//org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:160)
   app//org.elasticsearch.common.lucene.store.InputStreamIndexInput.read(InputStreamIndexInput.java:73)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler.sendFiles(RecoverySourceHandler.java:697)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:414)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:180)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:106)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$000(PeerRecoverySourceService.java:54)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:112)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:109)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:251)
   app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
   app//org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:192)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.lambda$messageReceived$0(SecurityServerTransportInterceptor.java:300)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$$Lambda$3228/0x0000000840b0a840.accept(Unknown Source)
   app//org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61)
   org.elasticsearch.xpack.security.authz.AuthorizationService.authorizeSystemUser(AuthorizationService.java:375)
   org.elasticsearch.xpack.security.authz.AuthorizationService.authorize(AuthorizationService.java:183)
   org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile.lambda$inbound$1(ServerTransportFilter.java:130)
   org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile$$Lambda$3230/0x0000000840b0b040.accept(Unknown Source)
   app//org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$2(AuthenticationService.java:246)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator$$Lambda$3232/0x0000000840b0b840.accept(Unknown Source)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$lookForExistingAuthentication$6(AuthenticationService.java:306)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator$$Lambda$3233/0x0000000840b0bc40.run(Unknown Source)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lookForExistingAuthentication(AuthenticationService.java:317)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.authenticateAsync(AuthenticationService.java:244)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.access$000(AuthenticationService.java:196)
   org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:139)
   org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile.inbound(ServerTransportFilter.java:121)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:307)
   app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63)
   app//org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1077)
   app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751)
   app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
   java.base@11.0.3/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   java.base@11.0.3/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)

This includes both the terms you were looking for.

The rest of the threads are at 0%.

Interesting, I wasn't expecting to see that. This tells us that every time we sampled the threads on this node (ten times, spanning 500ms) the recovery source thread was blocked waiting to read data from disk.
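(The ten samples spanning 500ms are the hot threads API's defaults, which can be adjusted; for example, to sample for longer on a single node, something like the following, with warm-0 again just a placeholder name:

GET /_nodes/warm-0/hot_threads?interval=1s&snapshots=20

)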

I moved the shard back and saw this on the node moving it:

The unique snapshot mentions something about a rate limiter?

app//org.apache.lucene.store.RateLimiter$SimpleRateLimiter.pause(RateLimiter.java:153)

4.2% (20.9ms out of 500ms) cpu usage by thread 'elasticsearch[warm-0][generic][T#5]'
 9/10 snapshots sharing following 44 elements
   java.base@11.0.3/sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
   java.base@11.0.3/sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:54)
   java.base@11.0.3/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:274)
   java.base@11.0.3/sun.nio.ch.IOUtil.read(IOUtil.java:245)
   java.base@11.0.3/sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:811)
   java.base@11.0.3/sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:796)
   ...
 unique snapshot
   java.base@11.0.3/java.lang.Thread.sleep(Native Method)
   java.base@11.0.3/java.lang.Thread.sleep(Thread.java:339)
   app//org.apache.lucene.store.RateLimiter$SimpleRateLimiter.pause(RateLimiter.java:153)
   app//org.elasticsearch.indices.recovery.RemoteRecoveryTargetHandler.writeFileChunk(RemoteRecoveryTargetHandler.java:170)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler.lambda$sendFiles$35(RecoverySourceHandler.java:708)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler$$Lambda$4103/0x0000000840c6bc40.run(Unknown Source)
   app//org.elasticsearch.common.util.CancellableThreads.executeIO(CancellableThreads.java:108)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler.sendFiles(RecoverySourceHandler.java:707)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:414)
   app//org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:180)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:106)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$000(PeerRecoverySourceService.java:54)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:112)
   app//org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:109)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:251)
   app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
   app//org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:192)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.lambda$messageReceived$0(SecurityServerTransportInterceptor.java:300)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$$Lambda$3225/0x0000000840b0a840.accept(Unknown Source)
   app//org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61)
   org.elasticsearch.xpack.security.authz.AuthorizationService.authorizeSystemUser(AuthorizationService.java:375)
   org.elasticsearch.xpack.security.authz.AuthorizationService.authorize(AuthorizationService.java:183)
   org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile.lambda$inbound$1(ServerTransportFilter.java:130)
   org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile$$Lambda$3227/0x0000000840b0b040.accept(Unknown Source)
   app//org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$2(AuthenticationService.java:246)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator$$Lambda$3229/0x0000000840b0b840.accept(Unknown Source)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$lookForExistingAuthentication$6(AuthenticationService.java:306)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator$$Lambda$3230/0x0000000840b0bc40.run(Unknown Source)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lookForExistingAuthentication(AuthenticationService.java:317)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.authenticateAsync(AuthenticationService.java:244)
   org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.access$000(AuthenticationService.java:196)
   org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:139)
   org.elasticsearch.xpack.security.transport.ServerTransportFilter$NodeProfile.inbound(ServerTransportFilter.java:121)
   org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:307)
   app//org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63)
   app//org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1077)
   app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751)
   app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
   java.base@11.0.3/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   java.base@11.0.3/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)

Yes, we will occasionally catch that thread doing something other than waiting on IO, and I think the rate-limiter will occasionally call Thread.sleep even if the average rate is lower than the target, because it's trying to limit the instantaneous rate instead. You can disable the rate limiter entirely by setting the rate to -1 if you want.
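(The setting in question is, I believe, indices.recovery.max_bytes_per_sec, which defaults to 40mb; it isn't named explicitly above, so treat that as my reading. Disabling it as described would look something like:

PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "-1"
  }
}

Setting it back to null restores the default.)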

Still, 9 times out of 10 it was waiting on IO.

Do you know of any possible solution?

Is there a configuration that can be edited?

I've been digging into the code and I have a hunch. Could you try closing an index, changing its store type to mmapfs, and then opening it again?

[EDIT: deleted the instructions that were originally here, because they were wrong, see below for correction]

Then could you trigger a recovery of a shard in this index and see if it's any faster?

To be clear, this is not a permanent fix: there are very good reasons not to use mmapfs as a matter of course, but if this improves the recovery performance then it gives us some potential options.

Hang on, there's a bug in my instructions. I will correct shortly.

I meant this:

POST /some_index/_close

PUT /some_index/_settings
{
  "store.type": "mmapfs"
}

POST /some_index/_open

To reset the store type back to the default, use this:

POST /some_index/_close

PUT /some_index/_settings
{
  "store.type": null
}

POST /some_index/_open

I have edited the index and it now uses mmapfs.

It doesn't seem to have increased the transfer speed, though.

I checked index.store.type before I edited it and it was set to a blank string. Is that meant to be niofs?
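(The effective value, defaults included, should be visible via the settings API with include_defaults, e.g. for an index named some_index:

GET /some_index/_settings?include_defaults=true&filter_path=**.index.store.type

)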

I looked at the hot_threads API and it seems to be doing more things, but the transfer rate is still around the 40MBps mark.

The top 4 are as follows:

12.0% (60ms out of 500ms) cpu usage by thread 'elasticsearch[warm-0][transport_worker][T#5]'
 2/10 snapshots sharing following 2 elements
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)

11.7% (58.5ms out of 500ms) cpu usage by thread 'elasticsearch[warm-0][transport_worker][T#8]'
 3/10 snapshots sharing following 31 elements
   java.base@11.0.3/sun.security.ssl.SSLCipher$T11BlockReadCipherGenerator$BlockReadCipher.decrypt(SSLCipher.java:1337)
   java.base@11.0.3/sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:240)
   java.base@11.0.3/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:197)
   java.base@11.0.3/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:160)
   java.base@11.0.3/sun.security.ssl.SSLTransport.decode(SSLTransport.java:108)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:672)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:627)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:443)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:422)
   java.base@11.0.3/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:634)
   io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:295)
   io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1301)
   io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1203)
   io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1247)
   io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502)
   io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441)
   io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278)
   io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
   io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
   io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
   io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
   io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
   io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
   io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
   io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
   io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656)
   io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556)
   io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510)
   io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470)
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)
 6/10 snapshots sharing following 2 elements
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)

1.4% (7ms out of 500ms) cpu usage by thread 'elasticsearch[warm-0][transport_worker][T#13]'
 2/10 snapshots sharing following 36 elements
   java.base@11.0.3/com.sun.crypto.provider.AESCipher.engineUpdate(AESCipher.java:410)
   java.base@11.0.3/javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:777)
   java.base@11.0.3/javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:555)
   java.base@11.0.3/javax.crypto.Cipher.update(Cipher.java:2040)
   java.base@11.0.3/sun.security.ssl.SSLCipher$T11BlockWriteCipherGenerator$BlockWriteCipher.encrypt(SSLCipher.java:1474)
   java.base@11.0.3/sun.security.ssl.OutputRecord.t10Encrypt(OutputRecord.java:366)
   java.base@11.0.3/sun.security.ssl.OutputRecord.encrypt(OutputRecord.java:274)
   java.base@11.0.3/sun.security.ssl.SSLEngineOutputRecord.encode(SSLEngineOutputRecord.java:274)
   java.base@11.0.3/sun.security.ssl.SSLEngineOutputRecord.encode(SSLEngineOutputRecord.java:175)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.encode(SSLEngineImpl.java:274)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.writeRecord(SSLEngineImpl.java:233)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:136)
   java.base@11.0.3/sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:116)
   java.base@11.0.3/javax.net.ssl.SSLEngine.wrap(SSLEngine.java:519)
   io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:1023)
   io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:829)
   io.netty.handler.ssl.SslHandler.wrapAndFlush(SslHandler.java:797)
   io.netty.handler.ssl.SslHandler.flush(SslHandler.java:778)
   io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
   io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
   io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
   io.netty.handler.logging.LoggingHandler.flush(LoggingHandler.java:265)
   io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
   io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
   io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
   io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
   io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
   io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
   io.netty.channel.AbstractChannelHandlerContext.access$1500(AbstractChannelHandlerContext.java:38)
   io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1152)
   io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1075)
   io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
   io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
   io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:474)
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)
 unique snapshot
   java.base@11.0.3/sun.nio.ch.EPoll.wait(Native Method)
   java.base@11.0.3/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
   java.base@11.0.3/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
   java.base@11.0.3/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
   io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:765)
   io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:413)
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)

1.4% (6.9ms out of 500ms) cpu usage by thread 'elasticsearch[warm-0][transport_worker][T#12]'
 9/10 snapshots sharing following 2 elements
   io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   java.base@11.0.3/java.lang.Thread.run(Thread.java:834)