Comparing data from an RDBMS to Elasticsearch

Hi All,

We have an RDBMS source from which we index data into Elasticsearch indices.
For reconciliation, we ran a few sum and count aggregations on both the source and the destination data, and found a few discrepancies.

Since the data runs to hundreds of millions of records, is there a simple way to find which records are missing or where the mismatches are?

P.S.: The composite key of the RDBMS table is used as the _id field of the Elasticsearch index.

There is no simple way, but this blog post may give you some pointers in the right direction: https://www.elastic.co/blog/elasticsearch-verifying-data-integrity-with-external-data-stores
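
One general pattern for this kind of check (not necessarily what the blog post does) is to stream both sides in the same key order and walk them together, so that neither side has to fit in memory. Below is a minimal sketch in Python, assuming each side can be exported as (key, checksum) pairs sorted by key with the same string ordering on both sides. The function name `diff_sorted`, the pair layout, and the problem labels are hypothetical, and how you produce the sorted streams (an ORDER BY query on the RDBMS, a sorted export from Elasticsearch) is left out. Note that sorting directly on _id is restricted in Elasticsearch, so you may need a copy of the key in a keyword field to sort on.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical row layout: (composite key as a string, checksum of the other columns)
Row = Tuple[str, str]


def diff_sorted(source: Iterable[Row], dest: Iterable[Row]) -> Iterator[Tuple[str, str]]:
    """Yield (key, problem) for each key that differs between two key-sorted streams."""
    src, dst = iter(source), iter(dest)
    s, d = next(src, None), next(dst, None)
    while s is not None or d is not None:
        if d is None or (s is not None and s[0] < d[0]):
            # Key exists only on the source side.
            yield s[0], "missing_in_destination"
            s = next(src, None)
        elif s is None or d[0] < s[0]:
            # Key exists only on the destination side.
            yield d[0], "missing_in_source"
            d = next(dst, None)
        else:
            # Same key on both sides: compare the checksums.
            if s[1] != d[1]:
                yield s[0], "checksum_mismatch"
            s, d = next(src, None), next(dst, None)


# Tiny made-up example; real inputs would be generators over the two stores.
rdbms_rows = [("a|1", "x1"), ("a|2", "x2"), ("b|1", "x3")]
es_docs = [("a|1", "x1"), ("b|1", "zz"), ("c|9", "x4")]
problems = list(diff_sorted(rdbms_rows, es_docs))
print(problems)
```

Because both inputs are consumed lazily, this works with generators that page through each store. It gives wrong results if the two sides are sorted with different collations, so it is worth verifying the ordering on a small sample first.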
