Checking Index Integrity

Nathan_F · October 7, 2015, 7:57am

I have been doing a moderately large-scale restoration from S3 backups made in ES 1.0 into a new cluster running ES 1.7. Two of my indices so far (among several hundred) are failing with index corruption errors during the restoration process. I wasn't able to find anything that helped me begin to approach this problem (short of a full reindex of the bad indices, which fortunately is possible). The following command seemed hopeful:

java -cp lucene-core*.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /var/lib/elasticsearch/elasticsearch/nodes/0/indices/generic_index/0/index/

Unsurprisingly, running this against the bad index (with -fix) ultimately resulted in massive document deletion. Out of curiosity, I tried running the same command against other green indices, but it complains about broken segments there as well.
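
For anyone wanting to repeat the read-only check across every shard of an index, here is a minimal sketch that only builds the command lines and never adds -fix unless explicitly asked. The helper name build_checkindex_commands and the lucene_jar argument are made up for illustration; the directory layout is assumed from the path in the command above, and the jar should presumably match the Lucene version that ships with the ES node.

```python
import os


def build_checkindex_commands(index_dir, lucene_jar, fix=False):
    """Build one CheckIndex command (argv list) per shard directory.

    index_dir  -- e.g. .../nodes/0/indices/generic_index (layout assumed from the post)
    lucene_jar -- path to a lucene-core jar (hypothetical argument; no shell globbing here)
    fix        -- only when True is -fix appended, which drops unreadable segments
    """
    # Shard directories are numeric; anything else (e.g. _state) is skipped.
    shards = sorted((n for n in os.listdir(index_dir) if n.isdigit()), key=int)
    commands = []
    for name in shards:
        shard_index = os.path.join(index_dir, name, "index")
        if not os.path.isdir(shard_index):
            continue
        cmd = [
            "java", "-cp", lucene_jar,
            "-ea:org.apache.lucene...",
            "org.apache.lucene.index.CheckIndex",
            shard_index,
        ]
        if fix:
            cmd.append("-fix")
        commands.append(cmd)
    return commands
```

Each returned list could be handed to subprocess.run; leaving fix=False keeps the check from deleting anything.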

Questions:

I take it the command above is not the right way to check index integrity?
Is putting "index.shard.check_on_startup: true" into elasticsearch.yml reliable in 1.7 even though it is flagged experimental? Restart reports green on a testing subset of my anticipated green indices, which is good news.
If the above are not options, I guess we simply expect ES to let us know if something isn't correct? I always assumed this, but my recent issues and the discovery of the CheckIndex command above got me wondering.

I bet I am being too concerned, but, given my naivety with Lucene, seeing the above command fail against a green index bothered me. If I do have corrupted data that I am unaware of and that spans all indices, recovery is going to be a major undertaking.

warkolm · October 7, 2015, 8:04am

There are known issues with Lucene corruption in older versions. We added checksums in both Lucene and ES to help prevent this, but you can still end up in the situation you are seeing.

Reindexing is the best option if you want to ensure data integrity.
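
ES 1.7 has no built-in reindex endpoint, so reindexing generally means reading the documents back (for example with scan/scroll, or from the original data source) and sending them through the bulk API. A minimal sketch of the bulk-body step only, assuming hits in the shape a search response returns (_type, _id, _source); hits_to_bulk_body and target_index are illustrative names, not anything ES provides.

```python
import json


def hits_to_bulk_body(hits, target_index):
    """Turn search hits into a newline-delimited bulk request body.

    Each document becomes two lines: an "index" action line, then its source.
    """
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index,   # new index name (caller's choice)
                "_type": hit["_type"],    # keep the original type
                "_id": hit["_id"],        # keep the original id
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

The returned string would be POSTed to the _bulk endpoint one batch at a time; this only works for documents whose _source was stored.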

Nathan_F · October 7, 2015, 8:07am

Can I at least assume that green indices in ES 1.7 when restarted with the option "index.shard.check_on_startup: true" are in fact not corrupt regardless of which version of ES created them?

warkolm · October 7, 2015, 9:07am

I think so; it does run a CheckIndex: https://github.com/elastic/elasticsearch/blob/v1.7.2/src/main/java/org/elasticsearch/index/shard/IndexShard.java#L1273
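
If it helps, the post-restart check can be scripted. This is a rough sketch, assuming the node answers at whatever base URL you pass in and that cluster health is requested with level=indices; fetch_health and non_green_indices are illustrative names, not part of ES. As I understand it, a shard that fails the startup check should not be opened, so its index would not report green.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url):
    """Fetch per-index health; base_url (e.g. http://localhost:9200) is an assumption."""
    with urlopen(base_url + "/_cluster/health?level=indices") as resp:
        return json.loads(resp.read().decode("utf-8"))


def non_green_indices(health):
    """Return {index_name: status} for every index whose status is not green."""
    return {
        name: info.get("status")
        for name, info in health.get("indices", {}).items()
        if info.get("status") != "green"
    }
```

Calling non_green_indices(fetch_health(base_url)) after the restart should give an empty dict if every index came back green.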

Nathan_F · October 7, 2015, 9:08am

I will take that as the reassurance I need for now. Thanks for pointing that out in the code for me.
