Kibana 5.5.0 - Node Advanced Tab Error

I've just upgraded some of my clusters to ES 5.5.0; this included upgrading the associated monitoring and Kibana nodes to 5.5.0 as well. After doing so, the "Advanced" tab for each node shows an error.

Looking at the ES logs for the monitoring cluster, I see the following warnings:

[2017-07-19T15:57:33,278][WARN ][rest.suppressed          ] path: /.monitoring-es-2-*%2C.monitoring-es-6-*/_search, params: {size=0, ignore_unavailable=true, index=.monitoring-es-2-*,.monitoring-es-6-*}
org.elasticsearch.action.search.SearchPhaseExecutionException:
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:271) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:92) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.onRejection(AbstractRunnable.java:63) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onRejection(ThreadContext.java:628) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:100) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:89) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.execute(AbstractSearchAsyncAction.java:286) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.FetchSearchPhase.run(FetchSearchPhase.java:81) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.executePhase(AbstractSearchAsyncAction.java:143) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:137) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:240) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.InitialSearchPhase.onShardResult(InitialSearchPhase.java:179) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.InitialSearchPhase.access$000(InitialSearchPhase.java:47) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.InitialSearchPhase$1.innerOnResponse(InitialSearchPhase.java:151) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.SearchActionListener.onResponse(SearchActionListener.java:44) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.SearchActionListener.onResponse(SearchActionListener.java:29) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:46) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1060) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1134) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1124) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1113) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.DelegatingTransportChannel.sendResponse(DelegatingTransportChannel.java:60) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.RequestHandlerRegistry$TransportChannelWrapper.sendResponse(RequestHandlerRegistry.java:111) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:331) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:327) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:644) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) [elasticsearch-5.5.0.jar:5.5.0]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.5.0.jar:5.5.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66-internal]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66-internal]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66-internal]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.action.search.FetchSearchPhase$1@3edf1d62 on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@d732092[Running, pool size = 25, active threads = 25, queued tasks = 1000, completed tasks = 19312]]
    at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:50) ~[elasticsearch-5.5.0.jar:5.5.0]
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66-internal]
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66-internal]
    at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:94) ~[elasticsearch-5.5.0.jar:5.5.0]
    ... 27 more

Has anyone else seen this? Or rather, can anyone see the node Advanced tab at all in Kibana 5.5.0? I see this in both clusters I've upgraded.

Austin

Hi godber!

Thanks for including the stack trace, it's very helpful.

The key to this error message is the "caused by" of the stack trace:

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.action.search.FetchSearchPhase$1@3edf1d62 on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@d732092[Running, pool size = 25, active threads = 25, queued tasks = 1000, completed tasks = 19312]]

There is a limit to how many search tasks can be queued up at a time, and the queue fills with one task per shard being searched, so the count effectively spans all shards hit by the request. The default queue capacity is 1000, and this page has failed to load because the searches required to render it have hit that limit.

See https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html

The main theory for why this could happen is that this cluster also holds non-monitoring data that is being searched, and those searches are hitting a lot of shards. If that is indeed the problem, then provisioning a dedicated monitoring cluster and exporting the monitoring data away from the production data will help a ton.

Another thought is that the time picker in the Monitoring UI is set to a large window of time, and the monitoring cluster needs an increase in resource capacity in order to perform a search that hits that many shards of monitoring data.

Hope that helps,
-Tim

Thanks for the response Tim!

In both cases these are dedicated monitoring clusters with very little use, so I have doubts that other queries are the problem.

Shrinking the time range to the most recent 15 minutes sometimes gets the page to render.

Austin

It is very odd that a dedicated monitoring cluster would have so many searches that it fills up the queue. By the way, you can monitor the queue activity in the Monitoring UI on the Node Advanced tab, in the chart titled "Read Threads."

You can also look at the current thread_pool stats directly in the nodes stats API: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html. If you stop using the Monitoring app, you should be able to see the threads in the search thread pool go down. That should give you some more insight and hopefully let you pinpoint the problem.


For anyone else coming to this topic, here's another thing to try in 5.6.x: https://discuss.elastic.co/t/kibana-5-6-9-advanced-node-tab-throwing-a-503/