Can we get a checksum of an index?

In an upgrade scenario where I need to verify the data in an ES DB after the upgrade, is there any API that returns a checksum of the indices in that ES DB?

As far as I know, Elasticsearch does not ship with any functionality to hash the full contents of an index.
Note two things though:

  1. ES's underlying storage layer, which is based on Lucene, automatically verifies the consistency of index files as it uses them, so any physical corruption of the data files will be detected anyway.
  2. When you upgrade ES, the index data itself (handled by Lucene) is not upgraded or transformed in any way, so it will not be modified by an ES version upgrade. This is the reason you have to reindex some indices if you want to continue using them after a major version upgrade, as documented here.

=> I don't think there is a use case for hashing index contents to verify consistency after an upgrade, as that consistency is ensured by other means.

Thanks for replying.
Just one more question: when you say "that consistency is ensured by other means", what are those means? Can you highlight some of them? Are they native to ES upgrade handling, or can we also use them ourselves to verify the data? I couldn't find any way to do that.

Thanks in advance.

Sure, there's nothing special there. As I mentioned:

  1. The upgrade doesn't really modify the data files in the first place, so there's pretty much no risk of silently corrupting the index data.
  2. The storage layer itself writes a checksum in the footer of every data file and validates that checksum in a few spots during normal ES operation. If files got corrupted for whatever reason, those checksums will be off and the affected shards will be failed, again preventing silent corruption.
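To illustrate the idea in point 2, here is a minimal Python sketch of a footer checksum. It is not Lucene's actual file format: the 4-byte CRC32 footer layout and the function names `add_footer` and `verify_footer` are assumptions made up for this example. It only shows why a corrupted file no longer matches the checksum stored in its footer.

```python
import struct
import zlib

# Hypothetical footer layout for illustration only: a 4-byte big-endian CRC32.
# Lucene's real footer format is different and more elaborate.
FOOTER = struct.Struct(">I")


def add_footer(payload: bytes) -> bytes:
    """Return the payload followed by a CRC32 checksum of that payload."""
    return payload + FOOTER.pack(zlib.crc32(payload))


def verify_footer(data: bytes) -> bool:
    """Recompute the CRC32 of the payload and compare it to the stored footer."""
    if len(data) < FOOTER.size:
        return False  # too short to even contain a footer
    payload, footer = data[:-FOOTER.size], data[-FOOTER.size:]
    (stored,) = FOOTER.unpack(footer)
    return zlib.crc32(payload) == stored


if __name__ == "__main__":
    good = add_footer(b"some segment data")
    print(verify_footer(good))  # intact data matches its footer

    # Simulate on-disk corruption by flipping one bit in the first byte.
    bad = bytes([good[0] ^ 0x01]) + good[1:]
    print(verify_footer(bad))  # the checksum no longer matches
```

In this sketch, a reader that calls `verify_footer` before trusting the data would reject the corrupted copy, which is the same kind of check that leads to a shard being failed rather than silently serving bad data.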

Hope that helps :)
