Compare 2 indices and find missing documents

I have 2 indices, index_a and index_b.

The 2 indices have documents with almost exactly the same template. index_b has some extra fields which were introduced as part of a new feature, but it also has all the fields already present in index_a.

We have noticed that index_b is missing about 2000 documents which are present in index_a. We found this out by using the _count API.

Now the question is: is there a way to find out which documents are actually missing? Just the missing ids would be enough for a start.

Both the indices have a field called member_id which is unique for each document and is the same as the document id, so retrieving only the missing id fields should also be enough.

I cannot compare the index directly to the source database because this data comes from an external API.

You have to do that "manually".

Search for all documents in the first index, get the ids, and then run a multi-get (_mget) API call against the second index to check which of them exist there.
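
As a rough illustration of that approach, here is a minimal Python sketch. It assumes you already have the ids from index_a in a list (how you page through the index is left out), and that you supply your own `mget` function which POSTs the given body to `/index_b/_mget` and returns the parsed JSON. The names `chunked`, `find_missing_ids` and `fake_mget` are made up for this example; `fake_mget` only imitates the shape of an _mget response so the snippet runs without a cluster.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` ids."""
    batch: List[str] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def find_missing_ids(
    source_ids: Iterable[str],
    mget: Callable[[Dict], Dict],
    batch_size: int = 1000,
) -> List[str]:
    """Return the ids for which the _mget response does not report found == true.

    `mget` is a hypothetical callable: it should send the body to
    /index_b/_mget and return the decoded JSON response.
    """
    missing: List[str] = []
    for batch in chunked(source_ids, batch_size):
        response = mget({"ids": batch})
        for doc in response["docs"]:
            # Entries without a "found" key (e.g. per-document errors)
            # are treated as missing in this sketch.
            if not doc.get("found", False):
                missing.append(doc["_id"])
    return missing


def fake_mget(body: Dict) -> Dict:
    """Stand-in for the real HTTP call; pretends index_b holds ids 1, 2 and 4."""
    present = {"1", "2", "4"}
    return {
        "docs": [
            {"_index": "index_b", "_id": doc_id, "found": doc_id in present}
            for doc_id in body["ids"]
        ]
    }


print(find_missing_ids(["1", "2", "3", "4", "5"], fake_mget, batch_size=2))
```

Sending the ids in batches keeps each request body at a manageable size; the batch size of 1000 is an arbitrary default, not a recommendation from the docs.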

Thank you for the response. I got a list of ids from the first index. When I use the _mget API, the response body contains an entry like the following for each id that does not exist in index_b.

{
    "_index": "tm_member5",
    "_id": "0",
    "found": false
}

I am going to write some logic to extract the ids from the objects where found == false.

But I just wanted to know whether the _mget API itself has an input parameter I can use to do that. I looked at the docs but could not find anything of this sort.

I'd use jq for this. It helps to filter a node and print whatever you need.
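
If jq is not at hand, the same kind of filtering can be sketched in a few lines of Python. This assumes the _mget response has been saved as JSON text with the usual top-level "docs" array; the function name `missing_ids_from_mget` and the second entry in the sample are made up for illustration.

```python
import json
from typing import List


def missing_ids_from_mget(response_text: str) -> List[str]:
    """Return the _id of every entry in "docs" that is not marked as found."""
    response = json.loads(response_text)
    return [
        doc["_id"]
        for doc in response.get("docs", [])
        if not doc.get("found", False)
    ]


# Sample response: the first entry is the one quoted above,
# the second is an invented example of a document that exists.
sample = """
{
  "docs": [
    {"_index": "tm_member5", "_id": "0", "found": false},
    {"_index": "tm_member5", "_id": "1", "found": true, "_source": {}}
  ]
}
"""

for doc_id in missing_ids_from_mget(sample):
    print(doc_id)
```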
