I upgraded to ES 5 and installed X-Pack, but I seem to be having some issues with it. I have 2 client nodes, 3 master nodes, and 5 data nodes. All 10 servers are now logging the error below multiple times per minute. Monitoring also has gaps in its data, including incorrect information such as showing no shards on a data node and then, a few minutes later, 300 shards. I assume this is caused by these errors, but I'm not sure where to look. I installed the X-Pack plugin on all nodes (client, master, data) and on Kibana.
[2016-10-27T17:09:35,697][ERROR][o.e.x.m.AgentService ] [esc2-client] exception when exporting documents
org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulks
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:148) ~[x-pack-5.0.0.jar:5.0.0]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:77) ~[x-pack-5.0.0.jar:5.0.0]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:194) ~[x-pack-5.0.0.jar:5.0.0]
at org.elasticsearch.xpack.monitoring.AgentService$ExportingWorker.run(AgentService.java:208) [x-pack-5.0.0.jar:5.0.0]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulk [default_local]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:114) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:62) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:145) ~[?:?]
... 4 more
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: bulk [default_local] reports failures when exporting documents
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.throwExportException(LocalBulk.java:121) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:111) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:62) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:145) ~[?:?]
... 4 more
By default, the Monitoring feature of X-Pack for Elasticsearch indexes its metrics into the local cluster itself. It looks like the bulk queue in your cluster is sometimes full, so the monitoring agent can't index the data.
I am assuming you don't have a custom Monitoring configuration on your Elasticsearch nodes, and that the monitoring data is being indexed into your production cluster (look for .monitoring-* indices). I would recommend setting up a dedicated monitoring cluster: you are better off not keeping the monitoring data on your production cluster, because if that cluster goes down, you lose the very metrics that could help you understand the issue.
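As a sketch of what that would look like: with a dedicated monitoring cluster, each production node ships its metrics over HTTP via an exporter configured in elasticsearch.yml. The host name below is hypothetical; substitute your own monitoring cluster's address.

```yaml
# elasticsearch.yml on each production node (hypothetical host name)
xpack.monitoring.exporters:
  my_monitoring_cluster:
    type: http
    host: ["http://monitoring-cluster.example.com:9200"]
```

With this in place, the .monitoring-* indices are created on the dedicated cluster instead of the production cluster, so they remain queryable even if production is down.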
@x10Corey There should be more information in the log files, can you please double-check and copy/paste the whole stack trace here (or in a gist)? Thanks
For us, this error appeared after installing Elasticsearch 5.1.1 together with X-Pack.
But the error also mentioned that there was no ingest node in the cluster.
Setting node.ingest: true in elasticsearch.yml made the error disappear.
It looks like Monitoring uses the ingest feature?
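For anyone hitting the same symptom, the change described above is a one-line setting in elasticsearch.yml (at least one node in the cluster needs it; whether Monitoring strictly requires an ingest node may depend on your version):

```yaml
# elasticsearch.yml -- make this node ingest-capable
node.ingest: true
```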
I find that users having issues with the default exporter usually do so because they are using a template that interferes with the .monitoring-* templates. This is usually from some sort of global template (where "template": "*"), which functionally changes the index pattern for Monitoring indices in an incompatible way.
Can you show the index definition for .monitoring-data-2?
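To gather that, something like the following in Kibana's Console (or the equivalent curl calls) should work; the second request is a way to spot a global template with a `"template": "*"` pattern:

```
GET /.monitoring-data-2

GET /_template?filter_path=*.template
```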
That's not the same error. The error this issue deals with is, at a high level, the Monitoring code failing to bulk-index documents into the Monitoring cluster.
You are failing to query them, which is the other side. That's a worthwhile issue, but when you do come back to that error, please create a new Discuss topic (for better discoverability for others) and include the error message as well as the versions of the stack you have installed.
Fortunately, this is a simple configuration issue. SERVICE_UNAVAILABLE/2/no master indicates that you did not have an elected master node in charge of your cluster when you sent your request. The issue appears to be that you only have 3 master-eligible nodes (["10.0.2.9", "10.0.0.20", "10.0.1.1"]), but your discovery.zen.minimum_master_nodes setting is strict and set to 3.
This should be set to (M / 2) + 1, with the division rounded down. Therefore it should be 2 if you only have 3 master-eligible nodes. If you set it to 3, then any hiccup (or a rolling restart, for that matter) means that no master node can be elected.
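The quorum arithmetic, as a quick sketch (the function name is mine; the formula is the one from the post, with the corresponding elasticsearch.yml line being e.g. `discovery.zen.minimum_master_nodes: 2`):

```python
def minimum_master_nodes(eligible_masters: int) -> int:
    # (M / 2) + 1 with the division rounded down, i.e. a strict majority.
    return eligible_masters // 2 + 1

print(minimum_master_nodes(3))  # → 2: one master node can fail and a quorum survives
print(minimum_master_nodes(4))  # → 3
print(minimum_master_nodes(5))  # → 3: two can fail
```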