Reindex document count does not match the source

I am reindexing an index from one cluster (Elasticsearch 6.8) to another cluster (Elasticsearch 7.17).

Source:
GET <index_name>/_count
{
"count" : 827908,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
}
}

Source:
GET _cat/indices/<index_name>
green open <index_name> 5 1 9904479 1512831 2.3gb 1.1gb

Destination:
GET <index_name>/_count
{
"count" : 827908,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
}
}

Destination:
GET _cat/indices/<index_name>
green open <index_name> 5 1 827906 0 2.1gb 1gb

Could you please explain why there is such a large difference between the source and the destination in the _cat/indices output, even though _count returns the same value on both?

Hi @Parvatayya_Malimath

Which difference do you mean? If it is the size difference, there are several possible causes:

  1. The source index contains deleted documents that still take up space and do not exist in the destination (docs.deleted is 1512831 on the source and 0 on the destination). This is the most likely cause of the difference. The counts are the same when you run _count; why the _cat/indices counts are not the same is unclear to me.

  2. Did you make sure the mapping is consistent between the source and the destination? Did you create the destination index with exactly the same mapping as the source, or did you just run the reindex?

  3. If the source index has been around for a while, Elasticsearch may have merged segments in the background (a form of compaction).

  4. There may be some small differences between the two versions, but I suspect 1 is the most likely cause, with 2 or 3 possibly adding some small variance.
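
To put numbers on the points above, here is a minimal Python sketch that reads the two _cat/indices lines quoted in the question and compares them with the _count result. It assumes the default column order, where the last four columns are docs.count, docs.deleted, store.size and pri.store.size; `parse_cat_line` is a hypothetical helper written for this post, not part of any Elasticsearch client.

```python
def parse_cat_line(line):
    """Parse one headerless line of default `_cat/indices` output.

    Assumption: the last four whitespace-separated columns are
    docs.count, docs.deleted, store.size and pri.store.size.
    """
    fields = line.split()
    docs_count, docs_deleted, store_size, pri_store_size = fields[-4:]
    return {
        "docs.count": int(docs_count),
        "docs.deleted": int(docs_deleted),
        "store.size": store_size,
        "pri.store.size": pri_store_size,
    }


# Lines copied from the question.
source = parse_cat_line("green open <index_name> 5 1 9904479 1512831 2.3gb 1.1gb")
dest = parse_cat_line("green open <index_name> 5 1 827906 0 2.1gb 1gb")

# Value returned by GET <index_name>/_count on both clusters.
api_count = 827908

for name, stats in (("source", source), ("destination", dest)):
    gap = stats["docs.count"] - api_count
    print(
        f"{name}: docs.count={stats['docs.count']} "
        f"docs.deleted={stats['docs.deleted']} "
        f"store.size={stats['store.size']} "
        f"(docs.count - _count = {gap})"
    )
```

Laying the numbers side by side like this makes it easier to see which columns actually differ (docs.count and docs.deleted on the source) before comparing the mappings of the two indices.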
