Checking Index Integrity


(The Dude) #1

I have been doing a moderately large-scale restoration from S3 backups made in ES 1.0 into a new cluster running ES 1.7. Two of my indices so far (out of several hundred) are failing with index corruption errors during the restore. I couldn't find anything that helped me approach this problem, short of a full reindex of the bad indices (which, fortunately, is possible). The following command seemed hopeful:

java -cp lucene-core*.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /var/lib/elasticsearch/elasticsearch/nodes/0/indices/generic_index/0/index/

Unsurprisingly, running this against the bad index (with -fix) ultimately resulted in massive document deletion. Out of curiosity, I tried running the same command against other green indices, and it complains about broken segments there as well.

Questions:

  1. I take it the command above is not the right way to check index integrity?

  2. Is putting "index.shard.check_on_startup: true" into elasticsearch.yml reliable in 1.7 even though it is flagged as experimental? After a restart, a test subset of the indices I expected to be green does report green, which is good news.

  3. If the above are not options, I guess we simply expect ES to let us know if something isn't correct? I always assumed this, but my recent issues and the discovery of the CheckIndex command above got me wondering.

I bet I am being too concerned, but given my inexperience with Lucene, seeing the above command fail against a green index bothered me. If I do have corrupted data that I am unaware of, and it spans all indices, recovery is going to be a major undertaking.


(Mark Walkom) #2

There are known issues with Lucene corruption in older versions. We added checksums in both Lucene and ES to help prevent this, but you can still end up in the situation you are seeing.

Reindexing is the best option if you want to ensure data integrity.


(The Dude) #3

Can I at least assume that green indices in ES 1.7, when restarted with the option "index.shard.check_on_startup: true", are in fact not corrupt, regardless of which version of ES created them?


(Mark Walkom) #4

I think so; it does run a CheckIndex: https://github.com/elastic/elasticsearch/blob/v1.7.2/src/main/java/org/elasticsearch/index/shard/IndexShard.java#L1273
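
For reference, a minimal sketch of the setting under discussion as it would appear in elasticsearch.yml. Only the `true` value comes from this thread. The other values mentioned in the comments are from memory of the 1.x documentation and should be treated as assumptions to verify against the 1.7 reference before use.

```yaml
# elasticsearch.yml (ES 1.7): check each shard when it is opened.
# The setting is flagged experimental, and a full check can take a long
# time on large shards, so expect slower startup and recovery.
index.shard.check_on_startup: true

# Assumption, verify against the 1.7 docs: other accepted values are
# believed to be "false" (the default, no check), "checksum" (verify file
# checksums only), and "fix" (also drop segments reported as corrupt,
# which loses documents in the same way as CheckIndex with -fix).
```

With `true`, a shard that fails the check should refuse to open rather than be modified, so it is the non-destructive choice for confirming that restored indices are intact.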


(The Dude) #5

I will take that as the reassurance I need for now. Thanks for pointing that out in the code for me.

