[ElasticSearch 2.2.0] I am occasionally getting Process Cluster Event Timeout Exception[failed to process cluster event (put-mapping [as]) within 30s] while bulk indexing documents

dynamicscope · February 20, 2016, 11:46am

Hello, ES users.

I am occasionally getting ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [as]) within 30s] while doing bulk indexing.

failed to execute bulk item (index) index {[uh-as-440-20150720][as][a1092e5ad6b925eb7c262b748695c0eb42e2342e::BCQzRjvdQpKtARfKTuGpvw==]
...
...
ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [as]) within 30s]
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:343)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

The cluster runs without any problem for a while, but from time to time it throws this exception.
I am not sure what causes it.
The indexing throughput to the ES cluster is steady.
(Currently, it's quiet. It wasn't while I was asleep.)

Q. What should I check to fix this issue?
Q. Is there a way to increase the timeout value?

Thank you.

warkolm · February 21, 2016, 8:01pm

What version are you on, how much data are you indexing, and how much is already in the cluster?

dynamicscope · February 21, 2016, 11:42pm

It's Elasticsearch v2.2.0.

{AWS EC2 m4.xlarge : 4 vCPUs, 16 GiB Mem} x 3

node-1: master/data
node-2: data-only
node-3: data-only

ES_HEAP_SIZE: 10g

Data: 100,000,000 docs
Indices: 13000
Shards: 3 / index
Replica: 1

After a while, the master throws the following OOM:

[2016-02-21 22:03:47,104][WARN ][monitor.jvm              ] [Hulk] [gc][old][18835][175] duration [3m], collections [5]/[3m], total [3m]/[1h], memory [9.9gb]->[9.9gb]/[9.9gb], all_pools {[young] [266.2mb]->[266.2mb]/[266.2mb]}{[survivor] [33.2mb]->[33.2mb]/[33.2mb]}{[old] [9.6gb]->[9.6gb]/[9.6gb]}
[2016-02-21 22:31:27,605][WARN ][transport.netty          ] [Hulk] exception caught on transport layer [[id: 0x180573ba, /172.31.7.130:40806 => /172.31.9.135:9300]], closing connection
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.CharsRefBuilder.<init>(CharsRefBuilder.java:35)
        at org.elasticsearch.common.io.stream.StreamInput.<init>(StreamInput.java:246)

Is there too little memory?
What is happening inside the master's memory? The data nodes seem to be working okay.

Christian_Dahlqvist · February 21, 2016, 11:49pm

Shards are not free; each one carries a certain amount of overhead in memory and file handles. With that many indices, the cluster state is also likely to be quite large and to use up a fair amount of memory.

Having 78,000 shards (if I count correctly) is way, way too many for a cluster of that size and specification, and will use up a lot of memory. I recommend you rethink your indexing/sharding strategy in order to dramatically reduce the number of shards in the cluster.
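
For reference, the shard count can be reproduced from the numbers posted earlier in the thread (13000 indices, 3 primary shards per index, 1 replica, 3 data-holding nodes). This is only a rough sketch of the arithmetic; the function names are made up for illustration and are not part of any Elasticsearch API.

```python
def total_shards(indices: int, primaries_per_index: int, replicas: int) -> int:
    """Total shard copies: every primary shard plus each of its replicas."""
    return indices * primaries_per_index * (1 + replicas)


def shards_per_node(total: int, data_nodes: int) -> float:
    """Average number of shard copies each data node has to hold."""
    return total / data_nodes


# Figures taken from the cluster description in this thread.
total = total_shards(indices=13000, primaries_per_index=3, replicas=1)
print(total)                      # 78000
print(shards_per_node(total, 3))  # 26000.0
```

That works out to roughly 26,000 shard copies per node on average, assuming they are spread evenly across the three nodes.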

Christian_Dahlqvist · February 21, 2016, 11:50pm

It seems this is a duplicate of this issue: What happends if I change master: true, data: true node to master-only node?

dynamicscope · February 21, 2016, 11:50pm

Would this be the same reason why the cluster throws ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [as]) within 30s]?

Christian_Dahlqvist · February 21, 2016, 11:52pm

I would expect this to cause a range of cluster issues.

warkolm · February 22, 2016, 1:10am

Let's continue the discussion here - What happends if I change master: true, data: true node to master-only node?
