I have a single-node cluster. I recently upgraded to the latest version of ELK (6.2.3).
I'm not sure whether my problem is related to the upgrade or is just a coincidence, but I've noticed that my scroll requests have started failing consistently on 1 shard out of all those available.
My Python script fails with the following message:
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 394, in scan
    (resp['_shards']['successful'], resp['_shards']['total'])
elasticsearch.helpers.ScanError: Scroll request has only succeeded on 555 shards out of 556.
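For context, my understanding is that the helper compares the successful and total shard counts in each scroll response and raises when they differ. Below is a minimal sketch of that check, not the library's actual code: `check_scroll_shards` and the local `ScanError` class are hypothetical stand-ins I wrote for illustration, and the `raise_on_error` flag mirrors the parameter of the same name that `elasticsearch.helpers.scan` accepts.

```python
class ScanError(Exception):
    """Local stand-in for elasticsearch.helpers.ScanError (illustration only)."""


def check_scroll_shards(resp, raise_on_error=True):
    """Return None if every shard answered, otherwise raise or return the message.

    `resp` is assumed to be a scroll response dict containing a `_shards` section.
    """
    shards = resp["_shards"]
    successful, total = shards["successful"], shards["total"]
    if successful < total:
        message = "Scroll request has only succeeded on %d shards out of %d." % (
            successful,
            total,
        )
        if raise_on_error:
            raise ScanError(message)
        # With raise_on_error=False the caller decides what to do with a partial result.
        return message
    return None


if __name__ == "__main__":
    # Shard counts taken from the error message above.
    partial = {"_shards": {"successful": 555, "total": 556}}
    print(check_scroll_shards(partial, raise_on_error=False))
```

Passing `raise_on_error=False` would only hide the symptom: the documents from the failed shard would still be missing from the results.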
Corresponding elasticsearch logs:
[2018-05-21T12:00:20,633][DEBUG][o.e.a.s.TransportSearchScrollAction] [elk2]  Failed to execute query phase
org.elasticsearch.transport.RemoteTransportException: [elk2][127.0.0.1:9300][indices:data/read/search[phase/query/scroll]]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.common.util.concurrent.TimedRunnable@61654bb5 on QueueResizingEsThreadPoolExecutor[name = elk2/search, queue capacity = 1000, min queue capacity = 1000, max queue capacity = 1000, frame size = 2000, targeted response rate = 1s, task execution EWMA = 1.7ms, adjustment amount = 50, org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@1eef7d00[Running, pool size = 61, active threads = 51, queued tasks = 952, completed tasks = 4451827077]]
    at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.3.jar:6.2.3]
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) ~[?:1.8.0_161]
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) ~[?:1.8.0_161]
    ...
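To make the search thread pool state in that rejection easier to read, here is a small sketch that pulls the numeric fields out of the log text. The helper name `parse_pool_state` and the choice of fields are my own for illustration, and the sample string is an excerpt of the log line above.

```python
import re


def parse_pool_state(log_text):
    """Extract selected 'name = number' fields from an executor rejection message."""
    fields = ("queue capacity", "pool size", "active threads", "queued tasks")
    state = {}
    for field in fields:
        # re.search returns the first occurrence, i.e. the plain "queue capacity"
        # rather than the later "min queue capacity" / "max queue capacity".
        match = re.search(r"\b%s = (\d+)" % re.escape(field), log_text)
        if match:
            state[field] = int(match.group(1))
    return state


if __name__ == "__main__":
    sample = (
        "QueueResizingEsThreadPoolExecutor[name = elk2/search, queue capacity = 1000, "
        "min queue capacity = 1000, max queue capacity = 1000, "
        "[Running, pool size = 61, active threads = 51, queued tasks = 952, "
        "completed tasks = 4451827077]]"
    )
    print(parse_pool_state(sample))
```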
Full traceback: https://pastebin.com/ksggHYRT
Could you please advise me on how to solve this issue?