Hi,
I run an Elasticsearch cluster in production. It has 3 master nodes and 4 data nodes, all on version 2.1.1.
It has one large index, about 2.5 TB, with only 5 shards. During index recovery the index stays in the translog stage for a long time. JVM memory usage gets very high and GC then runs repeatedly, which causes the data node to leave the cluster:
{data=false, master=true}], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
I tried raising the parameter discovery.zen.fd.ping_timeout to 600s, but got the same error:
{data=false, master=true}], reason [failed to ping, tried [3] times, each with maximum [10m] timeout
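For reference, a minimal sketch of what this setting might look like, assuming it is applied in elasticsearch.yml on each node and the node is restarted afterwards. The ping_retries line is only there for context: its value is taken from the "tried [3] times" part of the log above, and I did not change it.

```yaml
# elasticsearch.yml (sketch, assumed location of the change)
# Zen fault detection: how long to wait for each ping response.
# 600s is reported as [10m] in the log message.
discovery.zen.fd.ping_timeout: 600s
# Number of failed pings before the node is considered gone.
# Matches "tried [3] times" in the log; left unchanged.
discovery.zen.fd.ping_retries: 3
```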
Is there a way to recover the index?