Elasticsearch 1.5.2 master unresponsive

ananth · June 11, 2015, 10:20am

Hi,

A short introduction to our cluster:

2 dedicated master nodes, 30 data nodes, 15 client nodes, 5000 indices, and 28000 shards.

We initially had 30 data nodes, each with 32 GB RAM, a 2 TB disk, and a 16-core CPU. We hit OOM errors frequently, so we bought 26 high-end machines (128 GB RAM, 4 x 1 TB disks, 32-core CPU) and added them to our ES cluster. The cluster ran perfectly at ~40 TB with 56 data nodes. Our aim is to replace the low-end machines with high-end ones, so we decommissioned the low-end machines gradually (2 or 3 machines per day). We removed 26 machines, so we now have 30 data nodes (26 high-end and 4 low-end). We then decided to migrate from es-1.32 to es-1.5.2 and upgraded the cluster. There were no issues for 3 days.

Since yesterday we have been unable to create or delete indices. The master log shows only ProcessClusterEventTimeoutException for every task, although we have only 200 pending tasks. We normally create 1000 shards (200 indices) per day. Now only 200 shards get created, and that takes more than 3 hours, while the master still logs only ProcessClusterEventTimeoutException.
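
To see which kinds of tasks are waiting in the master's queue, the JSON returned by `GET /_cluster/pending_tasks` can be grouped by task type. This is only a minimal sketch: it assumes the response carries a `tasks` list whose entries have `source` and `time_in_queue_millis` fields, and that the task type is the part of `source` before the first ` [`. The function name `summarize_pending_tasks` and the sample payload are hypothetical, not output from this cluster.

```python
from collections import defaultdict


def summarize_pending_tasks(response):
    """Group pending cluster tasks by type.

    Returns (task_type, count, longest_wait_ms) rows, longest wait first.
    """
    counts = defaultdict(int)
    max_wait_ms = defaultdict(int)
    for task in response.get("tasks", []):
        # Assumption: "create-index [name], cause [api]" -> "create-index"
        kind = task.get("source", "unknown").split(" [", 1)[0]
        counts[kind] += 1
        max_wait_ms[kind] = max(max_wait_ms[kind], task.get("time_in_queue_millis", 0))
    return sorted(
        ((kind, counts[kind], max_wait_ms[kind]) for kind in counts),
        key=lambda row: row[2],
        reverse=True,
    )


# Hypothetical payload shaped like a pending_tasks response (illustration only)
sample = {
    "tasks": [
        {"insert_order": 101, "priority": "URGENT",
         "source": "create-index [logs-a], cause [api]", "time_in_queue_millis": 9000},
        {"insert_order": 102, "priority": "URGENT",
         "source": "create-index [logs-b], cause [api]", "time_in_queue_millis": 4000},
        {"insert_order": 103, "priority": "URGENT",
         "source": "delete-index [old-logs]", "time_in_queue_millis": 2500},
    ]
}

for kind, count, wait in summarize_pending_tasks(sample):
    print(f"{kind}: {count} task(s), longest wait {wait} ms")
```

Feeding it the real response body (parsed with `json.loads`) would show whether the 200 pending tasks are mostly index creations or something else.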

Our Zen properties are:
discovery.zen.ping.multicast.enabled: false
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.unicast.hosts: [ip1,ip2]
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 5
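
For comparison, the commonly recommended value for `discovery.zen.minimum_master_nodes` is a quorum of the master-eligible nodes, (N / 2) + 1. The sketch below just does that arithmetic; it assumes that only the 2 dedicated masters are master-eligible, which may not match the actual `node.master` settings, and the function name is illustrative.

```python
def quorum_minimum_master_nodes(master_eligible: int) -> int:
    """Quorum rule: (master-eligible nodes / 2) + 1, using integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Assumption: only the 2 dedicated master nodes are master-eligible
print(quorum_minimum_master_nodes(2))
# With a third dedicated master, one master could fail and a quorum would remain
print(quorum_minimum_master_nodes(3))
```

Under that assumption the quorum for 2 masters is 2, not the configured 1, and a third master-eligible node would be needed to tolerate the loss of one master.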

Any suggestions welcome.
