Now, after starting node2, the deleted index is back:
curl -XGET node1:9200/_cat/indices
green open testindex1 5 1 0 0 898b 503b
green open testindex2 5 1 0 0 970b 575b
This is fixed in 5.0. In previous versions we import dangling indices to prevent data loss, which can lead to situations like this one. I think 2.x no longer has a setting for this and always imports automatically.
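For reference, if I remember correctly 1.x exposed this through a local-gateway setting roughly like the one below (the exact name is from memory, so treat it as an illustration, not gospel); in 2.x the setting was removed and the import always happens:

# elasticsearch.yml on a 1.x node (setting name from memory, removed in 2.x)
gateway.local.auto_import_dangled: no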
Elasticsearch automatically imports indices that it finds on disk but that are not part of the cluster state. This was introduced to protect users who bring their cluster down, spin up new master nodes (thinking their data is safe on the data nodes) and end up with an empty cluster state. Another way to get there is a new master-eligible node joining a misconfigured cluster: it gets elected master and imposes its own, empty cluster state. Since index deletion is implemented by removing the index from the cluster state, the data nodes would interpret an empty cluster state as "delete everything", and all data would be gone. It is of course poor practice to throw away the data folder of any node (master or not), but people did, and the results were disastrous. Instead, the data nodes notify the master that they have data on disk and the master re-imports it.
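You can see this for yourself: the cluster state API returns the index metadata the master currently knows about, so after a delete the entry is simply gone (node1 and testindex1 below are just the names from your example):

curl -XGET 'node1:9200/_cluster/state/metadata/testindex1?pretty'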
This behavior has the downside that you discovered: if an index is deleted while a node is offline, it will be re-imported when the node comes back. The good news is that we recently introduced the notion of an index tombstone to deal with exactly this. See https://github.com/elastic/elasticsearch/pull/17265
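Once you are on 5.0 you should be able to see those tombstones recorded in the cluster state metadata, something along these lines (the index-graveyard field name is from memory, so double-check against your version):

curl -XGET 'node1:9200/_cluster/state/metadata?filter_path=metadata.index-graveyard&pretty'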