Failed to process cluster event (put-lifecycle-fb_test) within 30s

tegerei · March 26, 2024, 1:12pm

I am running a 3-node Elasticsearch cluster. I get the error below when I try to create an ILM policy:

org.elasticsearch.transport.RemoteTransportException: [es3][192.168.10.52:9300][cluster:admin/ilm/put]
Caused by: org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-lifecycle-fb_test) within 30s
	at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:158) ~[elasticsearch-7.17.9.jar:7.17.9]
	at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?]
	at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:157) ~[elasticsearch-7.17.9.jar:7.17.9]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) ~[elasticsearch-7.17.9.jar:7.17.9]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
	at java.lang.Thread.run(Thread.java:1589) [?:?]

Here are my cluster details:

{
  "cluster_name" : "ess",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 2448,
  "active_shards" : 2503,
  "relocating_shards" : 0,
  "initializing_shards" : 4,
  "unassigned_shards" : 2120,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 134,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 1143201,
  "active_shards_percent_as_number" : 54.09552625891506
}

I have tried restarting the cluster, but every time I try to perform a synced flush, I get a 502 error after 30 seconds.
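
One way to read the health output above: `task_max_waiting_in_queue_millis` is far larger than the 30s limit named in the exception, which suggests the master's pending-task queue is backed up rather than the ILM request itself being at fault. A minimal Python sketch of that arithmetic follows. The function name `summarize_backlog` is hypothetical, and the 30-second default is taken from the error message, not from any client library.

```python
import json

# Subset of the cluster health response posted above.
HEALTH_JSON = """
{
  "status": "red",
  "number_of_pending_tasks": 134,
  "task_max_waiting_in_queue_millis": 1143201,
  "unassigned_shards": 2120
}
"""

def summarize_backlog(health, timeout_s=30.0):
    """Compare the oldest queued task's wait time with a request timeout."""
    waited_s = health["task_max_waiting_in_queue_millis"] / 1000.0
    return {
        "pending_tasks": health["number_of_pending_tasks"],
        "oldest_wait_s": waited_s,
        "exceeds_timeout": waited_s > timeout_s,
    }

if __name__ == "__main__":
    print(summarize_backlog(json.loads(HEALTH_JSON)))
```

With the values posted, the oldest task has been queued for about 1143 seconds (roughly 19 minutes), so a new cluster-state update such as put-lifecycle is unlikely to be processed within 30 seconds.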

dadoonet · March 26, 2024, 2:57pm

The cluster is red so you might need to wait for the missing primary shards to be recovered.
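
A minimal sketch of waiting for recovery before retrying, using only the Python standard library. It calls the cluster health API with `wait_for_status` and then fetches the pending cluster tasks. The base URL `http://localhost:9200`, the lack of authentication and TLS, and the helper names are all assumptions for illustration; adjust them for the real cluster.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: replace with a real node address

def health_url(base, wait_for_status="yellow", timeout="60s"):
    """Build a _cluster/health URL that waits until the given status is reached."""
    query = urllib.parse.urlencode(
        {"wait_for_status": wait_for_status, "timeout": timeout}
    )
    return f"{base}/_cluster/health?{query}"

def get_json(url, http_timeout=90):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=http_timeout) as resp:
        return json.load(resp)

def wait_then_list_pending(base=BASE_URL):
    """Wait for at least yellow health, then return status and pending tasks."""
    health = get_json(health_url(base))
    pending = get_json(f"{base}/_cluster/pending_tasks")
    return health["status"], pending["tasks"]

# Example (needs a reachable cluster, so it is left commented out):
# status, tasks = wait_then_list_pending()
# print(status, len(tasks))
```

If the wait itself times out, the health request may come back with an HTTP error status, which `urlopen` raises as `HTTPError`; the sketch does not handle that case.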

tegerei · March 27, 2024, 11:52am

Hello,

Even when the cluster health is Yellow, I still get the same error.

Recently I have also been seeing warnings like this on the cluster:

[2024-03-27T12:49:42,962][WARN ][o.e.g.PersistedClusterStateService] [es3] writing cluster state took [15006ms] which is above the warn threshold of [10s]; wrote global metadata [false] and metadata for [4] indices and skipped [1598] unchanged indices

Christian_Dahlqvist · March 27, 2024, 11:57am

What is the hardware specification of the cluster? What type of storage are you using?

tegerei · March 27, 2024, 12:06pm

I have 3 nodes as follows:
node1: 64 GB RAM, 32 GB heap, 15 TB NVMe SSD, 8 cores
node2: 128 GB RAM, 64 GB heap, 15 TB NVMe SSD, 12 cores
node3: 128 GB RAM, 64 GB heap, 15 TB NVMe SSD, 12 cores

All three nodes are master-eligible.

tegerei · March 27, 2024, 12:11pm

Additional info

shards disk.indices disk.used disk.avail disk.total disk.percent host            ip              node
  1390        3.4tb     5.5tb      8.8tb     14.4tb           38 192.168.10.52 192.168.10.52 es1
   192      103.1gb     3.3tb       11tb     14.4tb           23 192.168.10.81 192.168.10.81 es2
  1344          5tb     6.1tb      8.3tb     14.4tb           42 192.168.10.25 192.168.10.25 es3
  1703                                                                                           UNASSIGNED
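
To compare shard counts across nodes, the table above can be read programmatically. This is a rough sketch that assumes the whitespace-separated layout shown in the post, with the shard count in the first column and the node name (or UNASSIGNED) in the last; `shards_per_node` is a hypothetical helper name.

```python
# The allocation table exactly as posted above.
ALLOCATION = """\
shards disk.indices disk.used disk.avail disk.total disk.percent host            ip              node
  1390        3.4tb     5.5tb      8.8tb     14.4tb           38 192.168.10.52 192.168.10.52 es1
   192      103.1gb     3.3tb       11tb     14.4tb           23 192.168.10.81 192.168.10.81 es2
  1344          5tb     6.1tb      8.3tb     14.4tb           42 192.168.10.25 192.168.10.25 es3
  1703                                                                                           UNASSIGNED
"""

def shards_per_node(text):
    """Map the last column (node name or UNASSIGNED) to the shard count."""
    counts = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields:
            counts[fields[-1]] = int(fields[0])
    return counts

if __name__ == "__main__":
    for node, count in shards_per_node(ALLOCATION).items():
        print(f"{node}: {count}")
```

On the figures posted, es2 holds 192 shards while es1 and es3 hold more than 1,300 each, and 1,703 shards are unassigned.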

