Kibana 5.6.9 Advanced Node Tab Throwing a 503


(Struve) #1

My problem is pretty similar to this one. We have a dedicated monitoring cluster with 2 nodes that we upgraded from 5.4 to 5.6.9 a month ago. Recently, we have not been able to load the "Advanced" tab for any individual node. Every time we try, we get a 503 search_phase_execution_exception. After looking into the error logs, I found that this is because we are blowing through the search thread pool queue limit.

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.action.search.FetchSearchPhase$1@6fed4e8b on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@39018d64[Running, pool size = 7, active threads = 7, queued tasks = 1003, completed tasks = 111115001]]
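The numbers embedded in that rejection message explain the 503 on their own: the search queue capacity is 1000, all 7 pool threads are active, and 1003 tasks are queued, so new search tasks get rejected. A small sketch (the helper name is mine, stdlib only) that pulls those figures out of such a log line:

```python
import re

def parse_rejection(log_line):
    """Extract thread-pool stats from an EsRejectedExecutionException message."""
    pairs = re.findall(
        r"(queue capacity|pool size|active threads|queued tasks)\s*=\s*(\d+)",
        log_line,
    )
    return {key: int(value) for key, value in pairs}

line = ("rejected execution of ... on EsThreadPoolExecutor[search, "
        "queue capacity = 1000, ...[Running, pool size = 7, active threads = 7, "
        "queued tasks = 1003, completed tasks = 111115001]]")

stats = parse_rejection(line)
# The rejection fires because queued tasks exceed the queue capacity:
assert stats["queued tasks"] > stats["queue capacity"]
```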

This monitoring cluster is not heavily used, and when I pull up its search stats you can see it really is not doing anything until we try to load one of the monitoring pages. Both nodes' search stats look similar to this one.

Other Info:

  • Timeframe I am looking at is 1 hour
  • All indexes on the cluster are green
  • We have tried to close old indexes in an effort to solve this issue but it has not worked. The cluster currently has 280 open .monitoring-es and .monitoring-kibana indexes

Unlike our main cluster, where I control how we handle searching, I don't know much about how Kibana executes its searches, so I am not sure of the best way to go about fixing this.

Thanks in advance for any help!

Molly


(Tim Sullivan) #2

Hi, Molly,

Sorry to hear your team is having this frustrating issue.

You can take a look at the real-time search queue size in Kibana by making a line chart on the monitoring data:

  • Y-Axis: Max
  • Field: node_stats.thread_pool.search.queue
  • X-Axis: Date histogram
  • Field: timestamp
  • Interval: Custom / 10s
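For reference, the same numbers can be pulled outside Kibana with a plain Elasticsearch aggregation against the monitoring indices. This is a sketch of the request body only; the index pattern and endpoint you POST it to are assumptions about a standard X-Pack monitoring setup:

```python
import json

# Aggregation mirroring the visualization above: the max search-queue
# size per 10-second bucket, taken from the monitoring documents.
body = {
    "size": 0,
    "aggs": {
        "per_10s": {
            "date_histogram": {"field": "timestamp", "interval": "10s"},
            "aggs": {
                "search_queue": {
                    "max": {"field": "node_stats.thread_pool.search.queue"}
                }
            }
        }
    }
}

# POST this body to /.monitoring-es-*/_search on the monitoring cluster.
print(json.dumps(body, indent=2))
```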

That will give you a chart that looks kind of like this:

As you can see, I tried to put some search load on my cluster, and I did that by clicking the pause/refresh button in the Advanced Node page repeatedly.

If you close all other browser pages searching against the monitoring data and watch that chart for a while, you should be able to see the queue go down. I would wait quite a bit and see if you can get it down close to zero. Once it is, you should be able to open the Advanced Node page in the monitoring application.


(Struve) #3

Thanks for the response @tsullivan!

My visualization looks very similar to yours. It never really increases past two, and when I try to load the advanced node page, the page throws a 503 with no change in this graph. If I instead chart the max of node_stats.thread_pool.search.rejected, it is a flat line, which does not seem right given the error I am seeing.


(Struve) #4

We seem to have fixed the problem by closing all our indexes up until March 1st. My guess is that the queuing has to do with the number of indexes it tries to search for each request. I noticed in the logs that it does not limit the indexes by date, despite the time window requested for a search.
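A sketch of the date-based selection logic described above, assuming the standard daily monitoring index naming (a `YYYY.MM.DD` suffix, e.g. `.monitoring-es-6-2018.02.15`); the index names and cutoff date here are illustrative:

```python
from datetime import date, datetime

def indices_to_close(index_names, cutoff):
    """Return daily monitoring indices dated strictly before `cutoff`.

    Assumes each name ends in a YYYY.MM.DD date after the last hyphen;
    names without a parseable date suffix are left open.
    """
    to_close = []
    for name in index_names:
        suffix = name.rsplit("-", 1)[-1]
        try:
            day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index, skip it
        if day < cutoff:
            to_close.append(name)
    return to_close

names = [".monitoring-es-6-2018.02.15",
         ".monitoring-es-6-2018.03.02",
         ".monitoring-kibana-6-2018.02.28"]
print(indices_to_close(names, date(2018, 3, 1)))
```

Each returned index can then be closed with `POST /<index>/_close`. Depending on your version, the `xpack.monitoring.history.duration` setting may also prune old monitoring indices automatically, avoiding the manual cleanup.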


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.