I have an Elasticsearch 5.5 cluster consisting of 8 nodes (all with the same configuration, each acting as both a master and a data node). Each node holds about 60 shards, and each shard contains about 15 GB of data.
The problem is that when I restart one node and it rejoins the cluster, all the data on that node is suddenly gone. The cluster then starts relocating shards to this node as if it were a new node, and the process takes many hours to complete.
I tried a synced flush before the restart, but the problem persists. The synced flush API call does not fully succeed because one index is accepting write operations all the time.
I don't think this is normal behavior when a node restarts, but I don't know what I did wrong. Any ideas?
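To make the partial synced flush concrete, here is a rough Python sketch of how the response from `POST /_flush/synced` could be checked for indices that did not fully succeed. The helper name `failed_synced_flush` and the sample response (index names, shard counts, failure reason) are made up for illustration; only the general response layout (a `_shards` summary plus one entry per index with `total`, `successful`, `failed`, and `failures`) is assumed from the synced flush API.

```python
import json


def failed_synced_flush(response):
    """Map each index whose synced flush had failed shards to its failure reasons."""
    failures = {}
    for name, stats in response.items():
        if name == "_shards":
            continue  # cluster-wide summary, not an index entry
        if stats.get("failed", 0) > 0:
            failures[name] = [f.get("reason", "unknown") for f in stats.get("failures", [])]
    return failures


# Hypothetical response body; the index names and numbers are illustrative only.
sample = json.loads("""
{
  "_shards": {"total": 4, "successful": 3, "failed": 1},
  "quiet-index": {"total": 2, "successful": 2, "failed": 0},
  "busy-index": {
    "total": 2, "successful": 1, "failed": 1,
    "failures": [{"shard": 1, "reason": "[2] ongoing operations on primary"}]
  }
}
""")

print(failed_synced_flush(sample))
```

In my case, the index that is constantly being written to would be the one showing up in the result.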
Oh, that's sad to hear.
Can you suggest a version that can "handle recoveries from restarts very gracefully"? Will v6.8 suffice?
I hope to keep the changes to my working application as minimal as possible.