I have been doing a moderately large-scale restoration from S3 backups made in ES 1.0 into a new cluster running ES 1.7. So far, two of my indices (out of several hundred) are failing with index corruption errors during the restore. I wasn't able to find anything that helped me approach this problem (other than a full reindex of the bad indices, which is fortunately possible). The following command looked promising:
java -cp lucene-core*.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /var/lib/elasticsearch/elasticsearch/nodes/0/indices/generic_index/0/index/
Unsurprisingly, running this against the bad index with -fix ultimately resulted in massive document deletion. Out of curiosity I then ran the same command against other green indices, and it complained about broken segments there as well.
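For completeness, the invocations looked roughly like this; the lucene-core jar version and the shard path are from my setup, so treat them as illustrative rather than exact:

# Read-only check first (no -fix); my understanding is CheckIndex should not be run while a writer has the index open
java -cp /usr/share/elasticsearch/lib/lucene-core-4.10.4.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /var/lib/elasticsearch/elasticsearch/nodes/0/indices/generic_index/0/index/

# The -fix variant (in hindsight, better run against a copy of the shard): it drops segments it cannot read, which is what deleted documents for me
java -cp /usr/share/elasticsearch/lib/lucene-core-4.10.4.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /var/lib/elasticsearch/elasticsearch/nodes/0/indices/generic_index/0/index/ -fix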
Questions:
- I take it the command above is not the right way to check index integrity?
- Is putting "index.shard.check_on_startup: true" into elasticsearch.yml reliable in 1.7 even though it is flagged experimental (see the config sketch after this list)? A restart with it enabled reports green on a test subset of the indices I expect to be green, which is good news.
- If neither of the above is an option, I guess we simply trust ES to let us know when something isn't correct? I had always assumed so, but my recent issues and the discovery of the CheckIndex command above got me wondering.
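For reference, this is the setting I am considering (node-level, in elasticsearch.yml on each data node); my understanding is that it takes effect when a shard is opened, but please correct me if I have the semantics wrong:

# elasticsearch.yml
# Verify shards when they are opened; flagged experimental in the 1.7 docs
index.shard.check_on_startup: true

My assumption is that this makes opening large shards noticeably slower, so I would only leave it enabled temporarily while validating the restored indices.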
I am probably being too concerned, but, given my naivety with Lucene, seeing the above command fail against a green index bothered me. If I do have corrupted data that I am unaware of and that spans all indices, recovery is going to be a major undertaking.