We're running v1.1.0 in production right now. We have 15 nodes and 15
primary shards with 4 replicas each (75 shards total).
Recently we've run into some problems that seem to be caused by using all
of the available threads and completely filling the queue. This is a
portion of what we see in the log:
[action.search.type] [Kang the Conqueror] [779661445] Failed to execute
fetch phase
Caused by:
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
rejected execution (queue capacity 1000) on
org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler@10970469
My question is: is it safe to increase the thread pool and queue size as a
solution? Is it necessary to increase both? If my understanding is correct,
increasing the thread pool size should increase the number of concurrent
searches, which might eliminate the need for a larger queue?
There is plenty of CPU available on all of the machines when this happens.
Current settings are the defaults, which in our case are a thread pool size
of 72 and a queue size of 1000.
Does anyone have any suggestions on tweaking these settings? Any best
practices to follow?
If this is not a proper solution what might be some alternatives?
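For reference, here is the shape such a change could take in elasticsearch.yml. This is a sketch only: the values are illustrative, not a recommendation, and the setting names are the ones used by the 1.x line.

```
# elasticsearch.yml -- illustrative values only, not a recommendation
threadpool.search.size: 100         # worker threads in the search pool
threadpool.search.queue_size: 2000  # pending requests held before rejection
```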
If search requests are being queued, then that means you probably do not
have capacity for more concurrent searches. Why are so many searches being
queued? Is it a temporary spike in search requests, or are some expensive
queries using up the existing threads?
Try increasing the queue size and monitoring what the maximum queue depth
actually reached is.
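One way to watch this is to poll the search pool's queue and rejected counters per node via the `_cat/thread_pool` API. The snippet below parses a canned sample line (the node name and numbers are hypothetical) so it is self-contained; against a live cluster you would feed it the real curl output instead.

```shell
# On a live cluster you would poll something like:
#   curl 'localhost:9200/_cat/thread_pool?h=host,search.active,search.queue,search.rejected'
# Here a canned sample line stands in for that output:
#   host   active queued rejected
sample="node01 72 950 1204"   # hypothetical numbers

# pull out the queued and rejected columns
queued=$(echo "$sample" | awk '{print $3}')
rejected=$(echo "$sample" | awk '{print $4}')
echo "queued=$queued rejected=$rejected"
```

A climbing rejected counter alongside a queue that sits near its cap is the signature of the EsRejectedExecutionException in the log above.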
You might also keep an eye on what your disk utilization is like when the
search queue is filling up; CPU isn't the only possible bottleneck here.
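On the disk side, `iostat -x` is a common way to see whether the data disks are saturated while the queue fills. The snippet parses a canned sample line in the shape of one `iostat -x` device row (device name and numbers are hypothetical) and flags high utilization:

```shell
# Sample row in the shape of `iostat -x` device output (abridged):
#   device reads/s writes/s %util
sample="sda 120.0 45.0 97.8"   # hypothetical numbers
util=$(echo "$sample" | awk '{print $4}')

# flag the device if it is busy more than 90% of the time
busy=$(awk -v u="$util" 'BEGIN { if (u > 90) print "saturated"; else print "ok" }')
echo "sda: $busy at ${util}% util"
```

A device pinned near 100% util while CPUs sit idle would point at I/O, not threads, as the real limit.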