X-Pack overview shows N/A although the .monitoring-es* indices are green


(John_ax) #1

Kibana Version: 5.5
Elasticsearch Version: 5.5
Plugins installed: [X-PACK]
java version "1.8.0_74"

The load was a little high. Before 2017.07.27 00:00:00 all the charts were normal. After 00:00:00 (when the new indices should have been created; because of the high disk load their creation was delayed, but they were eventually created) the Kibana overview charts show N/A.
Some relevant information is shown below. The necessary indices were created and their status is green:

curl localhost:9400/_cat/shards -s | grep "2017.07.27"
.monitoring-es-6-2017.07.27 13 p STARTED 1797 2.6mb xx.xx.xx.xx IXKzGCg
.monitoring-es-6-2017.07.27 32 p STARTED 1896 2.7mb xx.xx.xx.xx CtAXYCY
.monitoring-es-6-2017.07.27 5 p STARTED 1869 2.8mb xx.xx.xx.xx CtAXYCY
.monitoring-es-6-2017.07.27 27 p STARTED 1844 2.8mb xx.xx.xx.xx 8oFbNtj
...
curl localhost:9400/_cat/indices/.monitoring*-2017.07.27
green open .monitoring-kibana-6-2017.07.27 n8ukJJHxTH-LpTFHhw2GJQ 1 1 1017 0 1.6mb 610.2kb
green open .monitoring-es-6-2017.07.27 UYscELTpTiyGDpwI2nNfXA 36 0 68826 36637 114.5mb 114.5mb

Relevant log:

[2017-07-27T11:05:52,549][WARN ][o.e.x.m.e.l.LocalExporter] unexpected error while indexing monitoring document
org.elasticsearch.xpack.monitoring.exporter.ExportException: RemoteTransportException[[IXKzGCg][61.160.36.54:9700][indices:data/write/bulk[s]]]; nested: RemoteTransportException[[IXKzGCg][XX.XX.XX.XX][indices:data/write/bulk[s][p]]]; nested: EsRejectedExecutionException[rejected execution of org.elasticsearch.transport.TransportService$7@3d28c7b4 on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@43b45b98[Running, pool size = 24, active threads = 24, queued tasks = 200, completed tasks = 44910211]]];
......

[2017-07-27T11:05:52,570][WARN ][o.e.x.m.MonitoringService] [CtAXYCY] monitoring execution failed
org.elasticsearch.xpack.monitoring.exporter.ExportException: Exception when closing export bulk
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1$1.(ExportBulk.java:106) ~[x-pack-5.5.0.jar:5.5.0]
....
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_40]
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulks


(Felix Stürmer) #2

Hi @John_ax,

The error message indicates that the bulk processing queue of your Elasticsearch cluster is full and is not accepting new tasks. This is probably what keeps monitoring from persisting new metrics. You can check the status of your thread pools using the _cat API, e.g.

curl "${ELASTICSEARCH_URL}/_cat/thread_pool?v"

You might have to take steps to improve your cluster's performance or temporarily throttle your input volume until your cluster has caught up with a spike.
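To make the thread pool check easier to read on a larger cluster, here is a minimal sketch that filters the `_cat/thread_pool` output down to the bulk pools that have rejected tasks. It assumes the columns are requested explicitly as `node_name,name,active,queue,rejected`; the function names and the `ELASTICSEARCH_URL` variable are placeholders for illustration. Keep in mind that `rejected` is a counter accumulated since the node started, so to see whether throttling helps it is more telling to compare two samples taken some time apart than to look at a single value.

```shell
# Assumption: each input row has the columns
#   node_name name active queue rejected
# which is what the h= parameter in the curl call below asks for.

# Print "node_name rejected" for every bulk pool row with a non-zero count.
bulk_rejections() {
  awk '$2 == "bulk" && $5 + 0 > 0 { print $1, $5 }'
}

# Print "node_name delta" between two saved samples (older file first).
bulk_rejections_delta() {
  awk 'FNR == NR { if ($2 == "bulk") before[$1] = $5; next }
       $2 == "bulk" && ($1 in before) { print $1, $5 - before[$1] }' "$1" "$2"
}

# Against a live cluster (ELASTICSEARCH_URL is a placeholder):
#   curl -s "${ELASTICSEARCH_URL}/_cat/thread_pool?h=node_name,name,active,queue,rejected" | bulk_rejections
```

If the delta stays at zero between two samples, the queue is no longer overflowing, even though the absolute number remains high.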


(John_ax) #3

Thank you for the reply. Wow, I did some searching. I have 17 nodes, and the index rate is 14000/s.

curl -s 'http://host/_cat/thread_pool?v'
6sG63uh bulk 9 0 184495

The number of rejected bulk tasks is high, so I went to the node's host. What I saw was that the disk utilization is not high at all:

[image: screenshot of the node's disk utilization]

I really cannot figure out which configuration could have caused so many rejected bulk tasks.


(Felix Stürmer) #4

Hm, too bad the monitoring data are not persisted due to the high load. For that reason I would recommend using a dedicated monitoring cluster for production installations, maybe even off-site on Elastic Cloud. There is a free 14-day trial you could use to diagnose this particular problem for now.

As a stop-gap measure you might also be able to gain more insight using a third-party tool to display some cluster stats, such as http://www.elastichq.org/.

There are a lot of settings that can be used to tune the cluster (such as the refresh_interval), but which ones would be particularly helpful really depends on your cluster size, available resources and the workload.
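As an illustration of what changing one such setting looks like, here is a small sketch that raises `refresh_interval` on a single index through the index settings API. The index name `my-index` and the value `30s` are made up for the example, and `ELASTICSEARCH_URL` is a placeholder; whether a longer interval actually helps depends on the workload.

```shell
# Build the JSON body for the index settings API from an interval such as "30s".
refresh_settings_body() {
  printf '{"index":{"refresh_interval":"%s"}}' "$1"
}

# Hypothetical index name and value; ELASTICSEARCH_URL is a placeholder:
#   curl -s -XPUT "${ELASTICSEARCH_URL}/my-index/_settings" \
#        -H 'Content-Type: application/json' \
#        -d "$(refresh_settings_body 30s)"
```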


(John_ax) #5

:cold_sweat:
But the load is not high at all. And the monitoring data is being persisted now!
Still, ES keeps logging "failed to flush export bulks", so I think something is wrong with ES.
I also temporarily throttled the input volume, but the bulk reject count did not decrease. Why?


(Felix Stürmer) #6

Ok, so there might be other reasons why the monitoring indices cannot be written to. Did you manually change the shard allocation settings?

Could you check the output of curl "${ELASTICSEARCH_URL}/_cat/indices?v" for RED indices or provide its output here?
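With many daily indices the full listing gets long, so here is a small sketch that keeps only the red ones. It assumes the default `_cat/indices` column order (health first, index name third); the function name is made up for the example and `ELASTICSEARCH_URL` is a placeholder.

```shell
# Print the names of all indices whose health column is "red".
# Assumes the default _cat/indices column order: health status index ...
red_indices() {
  awk '$1 == "red" { print $3 }'
}

# Against a live cluster (ELASTICSEARCH_URL is a placeholder):
#   curl -s "${ELASTICSEARCH_URL}/_cat/indices" | red_indices
# The _cat/indices API can also filter server-side:
#   curl -s "${ELASTICSEARCH_URL}/_cat/indices?v&health=red"
```

An empty result means no index is currently red.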

