In Kibana, the Metricbeat System Overview dashboard fails to display data with a timeout error message

(Yogesh Mishra) #1

Hi,

I have the following Elasticsearch and Kibana setup on an Ubuntu box.

OS: Ubuntu 16.10 (yakkety)

Elasticsearch version: 6.2.3

Kibana version: 6.2.3

Metricbeat version: 6.2.3

RAM on Ubuntu: 24 GB

RAM allocated to Elasticsearch: 12 GB

CPU Cores: 8

In Elasticsearch, a new index is created daily, and this index continuously receives performance data from Metricbeat.

I imported the Metricbeat dashboards into Kibana as suggested by the Elasticsearch documentation.

Now, when I go to the Dashboards section in Kibana and open the "System Overview" dashboard with a 1-year time range, the dashboard does not show any data.
Elasticsearch performance also degrades considerably; even querying the Elasticsearch host on :9200 takes noticeably longer.
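For scale, it may help to estimate how many shard-level searches a 1-year query fans out to. The figures below are illustrative assumptions, not measurements: one metricbeat index per day, the Elasticsearch 6.x default of 5 primary shards per index, and a hypothetical dashboard with 10 panels.

```python
# Back-of-the-envelope estimate of the shard fan-out for a 1-year dashboard
# query. All figures are assumptions: one daily metricbeat index and the
# Elasticsearch 6.x default of 5 primary shards per index.
DAYS = 365
PRIMARY_SHARDS_PER_INDEX = 5   # 6.x default for index.number_of_shards
PANELS = 10                    # hypothetical number of visualizations on the dashboard

shard_searches_per_panel = DAYS * PRIMARY_SHARDS_PER_INDEX
total_shard_searches = shard_searches_per_panel * PANELS

print(shard_searches_per_panel)  # 1825 shard searches for one visualization
print(total_shard_searches)      # 18250 across the whole dashboard
```

With the search queue capped at 1,000 tasks, a single wide-range dashboard load of this shape could plausibly overflow it on its own.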

I get the following error in elasticsearch.log:

2019-02-12T04:32:06,350][DEBUG][o.e.a.s.TransportSearchAction] [AB6Tx6q] [metricbeat-2019.02.12][4], node[AB6Tx6qiS1uQC1YB1mWMQg], [P], s[STARTED], a[id=_pYPGRx-Rgy_jfGWezmYmw]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[metricbeat-*], indicesOptions=IndicesOptions[id=39, ignore_unavailable=true, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=, routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=5, batchedReduceSize=512, preFilterShardSize=42, source={"size":0,"query":{"bool":{"must":[{"range":{"@timestamp":{"from":1549963071912,"to":1549963971912,"include_lower":true,"include_upper":true,"format":"epoch_millis","boost":1.0}}},{"bool":{"must":[{"query_string":{"query":"beat.name:"elk-server"","fields":,"type":"best_fields","default_operator":"or","max_determinized_states":10000,"enable_position_increments":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"escape":false,"auto_generate_synonyms_phrase_query":true,"fuzzy_transpositions":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"aggregations":{"d3c67db1-1b1a-11e7-b09e-037021c4f8df":{"filter":{"match_all":{"boost":1.0}},"aggregations":{"timeseries":{"date_histogram":{"field":"@timestamp","time_zone":"Asia/Kolkata","interval":"10s","offset":0,"order":{"_key":"asc"},"keyed":false,"min_doc_count":0,"extended_bounds":{"min":1549963071912,"max":1549963971912}},"aggregations":{"d3c67db2-1b1a-11e7-b09e-037021c4f8df":{"max":{"field":"system.diskio.read.bytes"}},"f55b9910-1b1a-11e7-b09e-037021c4f8df":{"derivative":{"buckets_path":["d3c67db2-1b1a-11e7-b09e-037021c4f8df"],"gap_policy":"skip","unit":"1s"}},"dcbbb100-1b93-11e7-8ada-3df93aab833e":{"bucket_script":{"buckets_path":{"value":"f55b9910-1b1a-11e7-b09e-037021c4f8df[normaliz
ed_value]"},"script":{"source":"params.value > 0 ? params.value : 0","lang":"painless"},"gap_policy":"skip"}}}}}}}}}] lastShard [true]

org.elasticsearch.transport.RemoteTransportException: [AB6Tx6q][:9300][indices:data/read/search[phase/query]]

Caused by:
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.common.util.concurrent.TimedRunnable@1881ae0 on QueueResizingEsThreadPoolExecutor[name = AB6Tx6q/search, queue capacity = 1000, min queue capacity = 1000, max queue capacity = 1000, frame size = 2000, targeted response rate = 1s, task execution EWMA = 386nanos, adjustment amount = 50, org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor@56b3477[Running, pool size = 13, active threads = 10, queued tasks = 1527, completed tasks = 529703]]

at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.3.jar:6.2.3]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_131]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.common.util.concurrent.QueueResizingEsThreadPoolExecutor.doExecute(QueueResizingEsThreadPoolExecutor.java:88) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.search.SearchService.lambda$rewriteShardRequest$0(SearchService.java:994) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:113) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:86) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.search.SearchService.rewriteShardRequest(SearchService.java:992) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:312) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:372) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:369) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.TransportService.sendLocalRequest(TransportService.java:650) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.TransportService.access$000(TransportService.java:77) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.TransportService$3.sendRequest(TransportService.java:138) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:598) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:518) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:558) ~[elasticsearch-6.2.3.jar:6.2.3]
at org.elasticsearch.transport.TransportService.sendChildRequest(TransportService.java:549) ~[elasticsearch-6.2.3.jar:6.2.3]
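The EsRejectedExecutionException above is the search thread pool's bounded queue refusing new work: the log shows 1527 queued tasks against a capacity of 1000. A minimal Python sketch of the same abort-on-full behaviour, with the capacity shrunk to 3 purely for illustration:

```python
import queue

# A bounded task queue that rejects work once full, mirroring the abort
# policy reported in the log (capacity 1000, 1527 tasks already queued).
q = queue.Queue(maxsize=3)  # tiny capacity for illustration

rejected = []
for task in range(5):
    try:
        q.put_nowait(task)   # enqueue, like submitting a shard search
    except queue.Full:       # analogue of EsRejectedExecutionException
        rejected.append(task)

print(rejected)  # [3, 4]: everything past the queue capacity is rejected
```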

I have tried increasing the search queue size from 1,000 to 10,000 with the following setting in the elasticsearch.yml file:
thread_pool.search.queue_size: 10000

But even this did not resolve the issue.

Is there any other solution apart from increasing CPU, which is what I found suggested in a couple of places? Could you please suggest what steps can be taken to resolve this issue?
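One direction worth checking, besides adding CPU or deepening the queue, is shrinking the fan-out itself: fewer primary shards per daily index (set via an index template), or targeting only the indices a query actually needs instead of the full metricbeat-* pattern. A hypothetical helper (names and signature are illustrative, not an Elasticsearch API) that enumerates the daily index names for a window:

```python
from datetime import date, timedelta

# Hypothetical helper: list the daily metricbeat index names covering a
# window, so a request can target them explicitly instead of metricbeat-*.
def indices_for_range(start, end, pattern="metricbeat-{:%Y.%m.%d}"):
    names, day = [], start
    while day <= end:
        names.append(pattern.format(day))
        day += timedelta(days=1)
    return names

print(",".join(indices_for_range(date(2019, 2, 10), date(2019, 2, 12))))
# metricbeat-2019.02.10,metricbeat-2019.02.11,metricbeat-2019.02.12
```

Passing such an explicit index list (or a tighter date-math pattern) keeps a query from touching shards whose data falls entirely outside the requested time range.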

(Yogesh Mishra) #2

Hi All,

Would like to add some additional information about the issue.

We are embedding the Metricbeat dashboard in our web application via an iframe, and it becomes unresponsive when the dashboard is opened from our web application.

It works, with some glitches, when opened directly from Kibana.

Thanks in advance.

Regards,
Yogesh

(system) closed #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.