Occasionally I get the following exception on my 6-node Elasticsearch 2.4.6 cluster:
[2018-04-08 03:04:09,903][DEBUG][action.admin.indices.mapping.put] [PROD Node1] failed to put mappings on indices [[index_XXXXX]], type [TYPE_yyyy]
ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [TYPE_yyyy]) within 30s]
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:361)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Each update to the cluster state is single threaded in order to ensure consistency, so a large cluster state or very frequent updates to it can cause delays.
You should be able to run the following command to get the size of your cluster state: curl -XGET 'localhost:9200/_cluster/state/master_node/*?pretty'
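As a rough way to put a number on this, the sketch below fetches the full cluster state and reports the size of its compact JSON serialization. It assumes the node is reachable at http://localhost:9200 without authentication, and the helper names are made up for illustration. The JSON size is only a proxy for what the master actually holds and publishes, but it is handy for tracking growth over time.

```python
import json
import urllib.request


def serialized_size_bytes(state):
    """Return the size in bytes of the compact JSON form of a cluster state dict."""
    return len(json.dumps(state, separators=(",", ":")).encode("utf-8"))


def fetch_cluster_state(base_url="http://localhost:9200"):
    """Fetch the full cluster state (base_url is an assumption; adjust as needed)."""
    with urllib.request.urlopen(base_url + "/_cluster/state") as resp:
        return json.loads(resp.read().decode("utf-8"))


def describe(state):
    """Summarize the state: serialized size plus number of indices in the metadata."""
    indices = state.get("metadata", {}).get("indices", {})
    size_mb = serialized_size_bytes(state) / (1024.0 * 1024.0)
    return "%d indices, about %.2f MB of JSON" % (len(indices), size_mb)
```

Against a live cluster, `print(describe(fetch_cluster_state()))` would print the summary line.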
How many indices and shards do you have in the cluster? Are you using dynamic mappings? How many fields do you have per index? Do you have any indices where you have a very large and potentially ever growing list of types?
We do have some indices with around 60 types and up to 500 fields (60 and 500 are among the largest).
I thought that since 2.x, cluster state propagation is incremental and the master communicates only deltas. Also, why is the 5-minute timeout setting not honored?
That is a lot of indices and mappings, which could mean a large cluster state. Cluster state propagation is incremental in Elasticsearch 2.x, but if the state is very large, processing may still be slow. Can you check the size of your cluster state using the cluster state API?
We have 2 shards per index and 1 replica. Given our architecture, there is no easy way for us to reduce the number of indices. Is it advisable to consider multiple clusters rather than adding more nodes to our cluster, given that our number of indices will keep growing? Currently, we have 6 m4.4xlarge machines (16 vCPU, 64GB RAM) in our cluster.
In my previous response, I mentioned we have 5000 indices, which is incorrect. The number of indices is 2500 and the total number of shards is 10000.
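To make the numbers in this post concrete, here is a small sketch of the shard arithmetic using the figures quoted above (2500 indices, 2 primary shards each, 1 replica, 6 nodes). The function names are illustrative only, and the per-node figure assumes shards are spread evenly.

```python
def total_shards(indices, primaries_per_index, replicas):
    """Each primary shard has `replicas` copies, so multiply by (1 + replicas)."""
    return indices * primaries_per_index * (1 + replicas)


def shards_per_node(shards, nodes):
    """Average shards per node, assuming an even spread across the cluster."""
    return shards / float(nodes)


total = total_shards(indices=2500, primaries_per_index=2, replicas=1)
per_node = shards_per_node(total, nodes=6)
print(total)            # 10000
print(round(per_node))  # 1667
```

That works out to roughly 1,667 shards on each of the 6 nodes.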
Hi @ashokm
From personal experience, it seems you have a very high ratio of indices and shards per node. As each shard consumes resources, I'd try to restructure the data so it fits into fewer indices (and maybe use filtered aliases to keep queries intact).
Aside from that, I can say that while Elasticsearch v6.2.3 does improve many index and cluster operations, numbers such as yours may still hit update_mapping timeouts (as seen in https://github.com/elastic/elasticsearch/issues/30370).
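For the filtered-alias idea mentioned above, a minimal sketch of the request body for the `_aliases` API might look like the following. The index name `shared_index`, the field `tenant_id`, and the alias names are all hypothetical, and the sketch assumes documents that used to live in separate indices can be told apart by a single field value in the shared index.

```python
import json


def filtered_alias_actions(index, field, alias_to_value):
    """Build an _aliases request body with one term-filtered alias per old index name."""
    actions = []
    for alias, value in sorted(alias_to_value.items()):
        actions.append({
            "add": {
                "index": index,
                "alias": alias,
                # Queries through this alias only see documents matching the filter.
                "filter": {"term": {field: value}},
            }
        })
    return {"actions": actions}


# Hypothetical names: two former per-tenant indices folded into one shared index.
body = filtered_alias_actions(
    index="shared_index",
    field="tenant_id",
    alias_to_value={"index_tenant_a": "a", "index_tenant_b": "b"},
)
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent with a POST to `/_aliases`. Existing queries that target the old index names can then keep working through the aliases while the shard count drops.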