Process Cluster Event Timeout Exception on put-mapping

ashokm · April 8, 2018, 8:36pm

Hi All,

Occasionally I get the following exception on my 6-node Elasticsearch 2.4.6 cluster:

[2018-04-08 03:04:09,903][DEBUG][action.admin.indices.mapping.put] [PROD Node1] failed to put mappings on indices [[index_XXXXX]], type [TYPE_yyyy]
ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [TYPE_yyyy]) within 30s]
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:361)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

I have increased the timeout using the following code:

		client.admin().indices().preparePutMapping(index)
				.setType(mapping)
				.setSource(m.get(schemaMapping))
				.setTimeout("300s")
				.setMasterNodeTimeout("300s")
				.execute().actionGet();

Even with this, I still get 30-second timeouts. Why would it time out even though I increased the timeout to 5 minutes?

Any help is appreciated

Thanks a bunch

dadoonet · April 8, 2018, 8:48pm

IIRC, many improvements have been made in 5.x and 6.x over the past few years.

I'd suggest upgrading.

Christian_Dahlqvist · April 9, 2018, 5:33am

Each update to the cluster state is single threaded in order to ensure consistency, so a large cluster state or very frequent updates to it can cause delays.

You should be able to run the following command to get the size of your cluster state: curl -XGET 'localhost:9200/_cluster/state/master_node/*?pretty'
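
If curl is not convenient, the same measurement can be sketched in plain Java. This is only a rough sketch under assumptions: it assumes a node is reachable at `http://localhost:9200` without authentication, the class and method names are made up for illustration, and it reports the size of the uncompressed JSON response, which is not the same as the compressed size the cluster uses internally.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ClusterStateSize {

    // Counts the bytes in a stream without holding the whole body in memory.
    public static long countBytes(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // Fetches the full cluster state over HTTP and returns the size of the JSON body.
    public static long clusterStateBytes(String baseUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_cluster/state").openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            return countBytes(in);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: local node on the default HTTP port, no security plugin.
        long bytes = clusterStateBytes("http://localhost:9200");
        System.out.println("cluster state JSON: " + bytes + " bytes");
    }
}
```

A response in the hundreds of megabytes would suggest that the mappings and index metadata dominate the cluster state.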

How many indices and shards do you have in the cluster? Are you using dynamic mappings? How many fields do you have per index? Do you have any indices where you have a very large and potentially ever growing list of types?

ashokm · April 9, 2018, 10:21pm

Here is the information

We have around 5000 indices and 10,000 shards.
We are not using dynamic mappings.
We do have some indices with around 60 types and up to 500 fields (60 and 500 are among the largest).

I thought that since 2.x, cluster state propagation is incremental and the master communicates only deltas. Also, why is the 5-minute timeout not honored?

Thank you

Christian_Dahlqvist · April 10, 2018, 8:56pm

That is a lot of indices and mappings, which could mean a large cluster state. Cluster state propagation is incremental in Elasticsearch 2.x, but if the state is very large, processing may still be slowed down. Can you check the size of your cluster state using the cluster state API?

ashokm · April 10, 2018, 9:33pm

Christian,

It is around 500 MB if I dump the output of the cluster state API into a text file.

Thanks
Ashok

Christian_Dahlqvist · April 11, 2018, 4:57am

What is the compressed size as reported by the API I linked to?

ashokm · April 11, 2018, 1:52pm

The API doesn't return this information. I am not sure if it is available in the 2.4 version we are on.

Thanks
Ashok

Christian_Dahlqvist · April 12, 2018, 3:00pm

That is quite large and may very well cause the slowness. How come you have so many indices with just one primary and one replica shard?

ashokm · April 12, 2018, 4:01pm

It is 2 shards per index with 1 replica. Given our architecture, there is no easy way for us to bring down the number of indices. Is it advisable to consider multiple clusters as opposed to adding more nodes to our cluster, given that our number of indices will keep growing? Currently, we have six m4.4xlarge machines (16 vCPUs, 64 GB RAM) in our cluster.

In a previous response, I mentioned that we have 5000 indices, and that is incorrect. The number of indices is 2500 and the total number of shards is 10,000.
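
For reference, the arithmetic behind these figures can be sketched as below. It assumes that every index has the same 2 primaries and 1 replica, that all six nodes hold data, and that shards are spread evenly; the class and method names are only illustrative.

```java
public class ShardMath {

    // Total shard copies = indices * primaries per index * (1 + replicas per primary).
    public static int totalShards(int indices, int primariesPerIndex, int replicas) {
        return indices * primariesPerIndex * (1 + replicas);
    }

    // Average number of shard copies per node, rounded up.
    public static int shardsPerNode(int totalShards, int nodes) {
        return (totalShards + nodes - 1) / nodes;
    }

    public static void main(String[] args) {
        int total = totalShards(2500, 2, 1);   // figures quoted in this thread
        int perNode = shardsPerNode(total, 6); // assumes all 6 nodes hold data
        System.out.println(total + " shards, about " + perNode + " per node");
    }
}
```

With the numbers above this gives 10,000 shard copies, or roughly 1,667 per node.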

iamredlus · May 3, 2018, 8:17pm

Hi @ashokm
From personal experience, it seems like you have a very high ratio of indices and shards per node. As each shard takes resources, I'd try to amend the data structure so it fits into fewer indices (and maybe use filtered aliases to keep queries intact).
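
To illustrate the filtered alias idea, here is a minimal sketch that builds the request body for `POST /_aliases`. The index name `customers`, the alias `customer_42`, and the field `customer_id` are hypothetical, chosen only for the example, and the helper does no JSON escaping, so it assumes plain identifiers as input.

```java
public class FilteredAliasBody {

    // Builds the JSON body for POST /_aliases that adds one alias with a term filter.
    // Assumption: inputs are plain identifiers, so no JSON escaping is performed.
    public static String addFilteredAlias(String index, String alias,
                                          String field, String value) {
        return "{\"actions\":[{\"add\":{"
                + "\"index\":\"" + index + "\","
                + "\"alias\":\"" + alias + "\","
                + "\"filter\":{\"term\":{\"" + field + "\":\"" + value + "\"}}"
                + "}}]}";
    }

    public static void main(String[] args) {
        // Hypothetical names: one shared index, one alias per former index.
        System.out.println(
                addFilteredAlias("customers", "customer_42", "customer_id", "42"));
    }
}
```

Queries that previously targeted a dedicated index could then target the alias instead, and would only see documents matching the filter.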

Aside from that, I can say that while Elasticsearch v6.2.3 does improve many of the index and cluster operations, numbers such as yours may still lead to update_mapping timeouts (as shown in https://github.com/elastic/elasticsearch/issues/30370).

ashokm · May 3, 2018, 9:12pm

Thank you so much, Lior. We will revisit our architecture to reduce the number of indices.

