What is the best way to validate that data was migrated properly when doing a reindex?
We are migrating data from ES 1.7 to ES 5.1. We are planning on using the Reindex from Remote API, passing in a query to limit on subsets of the entire data set.
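For context, the kind of request we have in mind looks roughly like the sketch below. It only builds the JSON body that would be sent to `POST _reindex` on the 5.1 cluster. The host, index names, and date-range query are made-up placeholders, and `build_reindex_body` is a hypothetical helper, not part of any client library.

```python
import json


def build_reindex_body(remote_host, source_index, dest_index, query):
    """Build a Reindex from Remote request body limited by a query.

    All arguments are placeholders for illustration only.
    """
    return {
        "source": {
            "remote": {"host": remote_host},  # the old ES 1.7 cluster
            "index": source_index,
            "query": query,                   # restricts the subset to copy
        },
        "dest": {"index": dest_index},        # index on the new ES 5.1 cluster
    }


# Hypothetical example: copy one month of documents at a time.
body = build_reindex_body(
    "http://old-cluster:9200",
    "events",
    "events_v5",
    {"range": {"created": {"gte": "2016-01-01", "lt": "2016-02-01"}}},
)
print(json.dumps(body, indent=2))
```

As far as we understand, the remote host also has to be listed under `reindex.remote.whitelist` in `elasticsearch.yml` on the destination cluster.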
Our thinking is to export the data from both clusters and then compare the two exports. Possible ways we were looking into doing validation:
- Using the pagination API. This page, however, seems to recommend using the scroll API instead.
- Using the scroll API
- Using ElasticDump
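Whichever of these is used to pull the documents out, the comparison step we are picturing is roughly the following. This is a minimal sketch that assumes each export can be read as `(id, source)` pairs; the function names are hypothetical and the actual fetching (scroll or ElasticDump) is not shown.

```python
import hashlib
import json


def fingerprint(source):
    """Hash a document body in a key-order-independent way."""
    canonical = json.dumps(source, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_exports(old_docs, new_docs):
    """Compare two iterables of (doc_id, source_dict) pairs.

    Returns the ids that are missing from the new export, ids that only
    exist in the new export, and ids whose content differs.
    """
    old = {doc_id: fingerprint(src) for doc_id, src in old_docs}
    new = {doc_id: fingerprint(src) for doc_id, src in new_docs}
    shared = set(old) & set(new)
    return {
        "missing": sorted(set(old) - set(new)),
        "unexpected": sorted(set(new) - set(old)),
        "changed": sorted(i for i in shared if old[i] != new[i]),
    }


# Made-up sample data standing in for the two exports.
old_export = [("1", {"a": 1, "b": 2}), ("2", {"a": 3}), ("3", {"a": 5})]
new_export = [("1", {"b": 2, "a": 1}), ("2", {"a": 4}), ("4", {"a": 6})]
report = compare_exports(old_export, new_export)
print(report)
```

Storing only a hash per id keeps memory use down compared with holding every document, although for a very large index even the id-to-hash map may need to be processed in batches.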
Alternatives to this include validating subsets of the data rather than the whole data dump, especially if the data is excessively large.
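For the subset idea, one option we considered is picking a repeatable sample of document ids so that the same documents are checked on both clusters. The sketch below is an assumption on our part, with hypothetical function names; it selects ids by hashing them rather than by random choice, so the sample is stable between runs.

```python
import hashlib


def in_sample(doc_id, fraction):
    """Return True if doc_id falls into a stable sample of the given fraction."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < int(fraction * 2**64)


def sample_ids(doc_ids, fraction):
    """Keep only the ids that belong to the sample, preserving order."""
    return [doc_id for doc_id in doc_ids if in_sample(doc_id, fraction)]


# Made-up ids; roughly 10% of them should be selected.
all_ids = [str(n) for n in range(1000)]
picked = sample_ids(all_ids, 0.1)
print(len(picked), "of", len(all_ids), "ids selected")
```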
However, all of this may be moot if Elasticsearch already does some sort of internal validation while migrating data. Or maybe there is another utility available for this type of validation?