Comparing Two Indices Across Clusters

connellyethan · November 14, 2022, 8:57pm

Hi All,

We are currently attempting to compare two indices across clusters to verify we did not lose any records or fields during the process of cross cluster replication, which is how we chose to do this data migration.

Our previous attempt called the search API on both clusters for the maximum of 10,000 records per request, sorted on our unique IDs, ran a deep-equals comparison locally between the two response bodies, and then moved on to the next 10,000 records. However, we are seeing inconsistencies in the sort order of these IDs between the two clusters, and those inconsistencies cause our deep-equals check to fail.
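One way to make the page-by-page comparison described above independent of sort order is to key each record by its unique ID and compare per-record digests instead of whole response bodies. The following is a minimal Python sketch, not a definitive implementation. It assumes each hit is a dict with a `_source` object, and that the unique ID lives in a field named `record_id` (a hypothetical name). Fetching the hits from each cluster is left out.

```python
import hashlib
import json
from typing import Any, Dict, Iterable, List

Hit = Dict[str, Any]


def digest_hits(hits: Iterable[Hit], id_field: str = "record_id") -> Dict[str, str]:
    """Map each record's unique ID to a SHA-256 digest of its _source.

    `id_field` is a hypothetical field name; use whatever field holds your ID.
    """
    digests: Dict[str, str] = {}
    for hit in hits:
        source = hit["_source"]
        # sort_keys gives a canonical form, so key order inside a document
        # does not change the digest.
        canonical = json.dumps(source, sort_keys=True, separators=(",", ":"))
        digests[str(source[id_field])] = hashlib.sha256(
            canonical.encode("utf-8")
        ).hexdigest()
    return digests


def compare_digests(left: Dict[str, str], right: Dict[str, str]) -> Dict[str, List[str]]:
    """Report IDs missing on either side and IDs whose content differs."""
    return {
        "missing_in_right": sorted(left.keys() - right.keys()),
        "missing_in_left": sorted(right.keys() - left.keys()),
        "mismatched": sorted(
            key for key in left.keys() & right.keys() if left[key] != right[key]
        ),
    }
```

Because the digests are collected into dictionaries keyed by ID, the order in which each cluster returns its pages no longer matters. The trade-off is that all IDs and digests are held in memory for the comparison.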

Is there any reason that these sorts could be showing inconsistencies? They are long alphanumeric strings.

Secondly, we were looking at this transform example here:

Is there anything similar to this that we could use across two different clusters? Or perhaps a way to compare them both on the same cluster search? If you guys could point me in the right direction I would greatly appreciate it!

Thanks,
Ethan

warkolm · November 14, 2022, 10:28pm

Welcome to our community!

Are you using _id? Are you defining this yourself?

We aren't all guys, but I would suggest that you look at using CCR as the best option for this.

connellyethan · November 15, 2022, 2:36pm

Thank you! Apologies for the general use of the word "guys".

We were using a unique id we gave the records themselves.

So we already performed the CCR successfully and can verify that the counts are the same between the two clusters and the respective indices. What we were hoping to do was a granular comparison at the record and field level to make sure the data is exactly the same between the two clusters.
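For the record- and field-level comparison described above, a field-by-field diff of two copies of the same record could look roughly like the following Python sketch. It assumes both copies are already available locally as parsed JSON objects. The function names are illustrative, not part of any Elasticsearch client.

```python
from typing import Any, Dict


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into dotted paths, e.g. {"user": {"name": 1}} -> {"user.name": 1}.

    Lists and empty objects are kept as leaf values and compared as a whole.
    """
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_fields(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Return fields present on only one side and fields whose values differ."""
    flat_left, flat_right = flatten(left), flatten(right)
    shared = flat_left.keys() & flat_right.keys()
    return {
        "only_left": sorted(flat_left.keys() - flat_right.keys()),
        "only_right": sorted(flat_right.keys() - flat_left.keys()),
        "changed": {
            path: (flat_left[path], flat_right[path])
            for path in sorted(shared)
            if flat_left[path] != flat_right[path]
        },
    }
```

An empty result in all three entries means the two copies carry the same fields with the same values.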

connellyethan · November 15, 2022, 2:41pm

I also forgot to mention we are currently using Elasticsearch 7.10.

Christian_Dahlqvist · November 15, 2022, 3:10pm

That version is very old, so I would recommend upgrading to 7.17 as soon as possible.

Hendrik_Muhs · November 16, 2022, 7:37am

Transform supports CCS, therefore you can use it for this task.
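As a rough illustration of what that could look like, the Python sketch below builds a transform request body that reads the local index together with the same index on a remote cluster, using the `cluster_alias:index` cross-cluster search syntax, and groups documents by the unique ID. The cluster alias, index names, and ID field are hypothetical. The Painless scripts are an assumption, loosely modeled on the index-comparison example in the transform documentation, and should be checked against the docs for your version.

```python
from typing import Any, Dict

# Assumed Painless reduce script: expects one state per index for each ID.
REDUCE_SCRIPT = """
if (states.size() != 2) { return "count_mismatch"; }
if (states.get(0).equals(states.get(1))) { return "match"; }
return "mismatch";
"""


def build_compare_transform(
    index: str, remote_alias: str, id_field: str, dest_index: str
) -> Dict[str, Any]:
    """Build a transform body comparing `index` locally and on `remote_alias`.

    All argument values are placeholders; `id_field` should be a keyword field.
    """
    return {
        "source": {
            # "alias:index" addresses the index on the remote cluster (CCS).
            "index": [index, f"{remote_alias}:{index}"],
            "query": {"match_all": {}},
        },
        "dest": {"index": dest_index},
        "pivot": {
            "group_by": {"unique_id": {"terms": {"field": id_field}}},
            "aggregations": {
                "compare": {
                    "scripted_metric": {
                        "map_script": "state.doc = new HashMap(params['_source'])",
                        "combine_script": "return state",
                        "reduce_script": REDUCE_SCRIPT,
                    }
                }
            },
        },
    }
```

The resulting body would be sent with `PUT _transform/<transform_id>` and the transform started with `POST _transform/<transform_id>/_start`. The destination index would then hold one document per unique ID with a `compare` value to filter on.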

system · December 14, 2022, 7:37am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
