ES5 -> ES6 -> ES7, Snapshot Restore, Reindex, Index Size increase

Hi,

I'm migrating from ES5 to ES7 by loading the indices in ES6, reindexing them, taking a snapshot, then restoring that snapshot in ES7 and reindexing again. I have noticed that the index size varies dramatically along the way.
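
For reference, the steps after loading the indices in ES6 can be sketched as a list of REST calls. This is only a rough sketch: the index, repository and snapshot names are placeholders I made up, and the request bodies are the minimal ones (no extra settings).

```python
def migration_plan(index, reindexed, repo, snapshot):
    """Return the REST calls as (cluster, method, path, body) tuples.

    All names passed in are placeholders; the repository is assumed to be
    registered on both clusters already.
    """
    return [
        # Reindex on ES6 so the index is rewritten in the ES6 format.
        ("es6", "POST", "/_reindex",
         {"source": {"index": index}, "dest": {"index": reindexed}}),
        # Snapshot the reindexed index on ES6.
        ("es6", "PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": reindexed}),
        # Restore that snapshot on ES7.
        ("es7", "POST", f"/_snapshot/{repo}/{snapshot}/_restore",
         {"indices": reindexed}),
        # Reindex again on ES7 into a new index.
        ("es7", "POST", "/_reindex",
         {"source": {"index": reindexed}, "dest": {"index": f"{reindexed}-es7"}}),
    ]


for cluster, method, path, body in migration_plan("my-index", "my-index-v6", "my-repo", "snap-1"):
    print(cluster, method, path, body)
```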

After the reindex from ES5 to ES6, the index sizes went down.
The index I'm currently testing is 2.4 GB after restoring in ES7, and it was also 2.4 GB before taking the snapshot, so all good up to that point.
But if I reindex it on ES7, the index size changes to 4.2 GB. Both indices have the same settings (5 shards, 1 replica, no additional settings applied).
The ES7 index has 7666735 docs (same as ES6), but docs.deleted is 1653461. During the reindex I also noticed that the index size kept increasing after all documents had been added.

Can someone hint / explain why this happens?
Why does a reindex increase the size by 75%?
Why are there suddenly docs.deleted?
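
To show where the 75% comes from, and how large the deleted share is, here is a small calculation using only the numbers above. Treating docs.deleted as a share of all documents held in the segments is my own way of looking at it; deleted documents keep occupying space until their segments are merged.

```python
def percent_increase(before, after):
    """Relative growth from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def deleted_fraction(live_docs, deleted_docs):
    """Share of documents in the segments that are marked as deleted."""
    return deleted_docs / (live_docs + deleted_docs)


# 2.4 GB after restore vs. 4.2 GB after reindex on ES7
print(f"size increase: {percent_increase(2.4, 4.2):.0f}%")
# 7666735 live docs, 1653461 docs.deleted
print(f"deleted share: {deleted_fraction(7666735, 1653461):.1%}")
```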

It seems Reindex produces a lot of segments, which are then merged again after some time.
After waiting around an hour, the index stopped at 3.6 GB in size (was around 7 GB sometimes) but the deleted element count was still high.
Then I ran _forcemerge?max_num_segments=1 and it removed all the deleted documents.
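
The force merge call looks roughly like this. It is a minimal sketch using only the Python standard library; the base URL and index name are placeholders, and it assumes a cluster without authentication.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def forcemerge_url(base_url, index, max_num_segments=1):
    """Build the URL for POST /<index>/_forcemerge?max_num_segments=N."""
    query = urlencode({"max_num_segments": max_num_segments})
    return f"{base_url.rstrip('/')}/{quote(index, safe='')}/_forcemerge?{query}"


def force_merge(base_url, index, max_num_segments=1):
    """Send the force merge request and return the parsed JSON response."""
    request = urllib.request.Request(
        forcemerge_url(base_url, index, max_num_segments), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example (placeholder host and index name, not executed here):
# force_merge("http://localhost:9200", "my-index-es7")
```

The request only returns once the merge has finished, so on a large index it can take a while.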

Now both indices have the same docs count and 0 docs.deleted, but the ES6 index (after force merge) is 2.3 GB and the ES7 index (after force merge) is 2.6 GB. How can that be?

Every segment in each of the indices contains around 1 530 000 documents. In the ES6 index the segments are ~490 MB in size, in the ES7 index they are ~540 MB.
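
The per-segment comparison can be done with something like the sketch below. It assumes the rows come from `GET _cat/segments/<index>?format=json&bytes=b`, where values arrive as strings. The sample rows are not real API output: they are the rounded figures from this post (with MB taken as 10^6 bytes), and the index names are placeholders.

```python
from collections import defaultdict


def summarize_segments(rows):
    """Sum docs and bytes of primary-shard segments per index."""
    totals = defaultdict(lambda: {"segments": 0, "docs": 0, "deleted": 0, "bytes": 0})
    for row in rows:
        if row["prirep"] != "p":  # skip replica copies
            continue
        entry = totals[row["index"]]
        entry["segments"] += 1
        entry["docs"] += int(row["docs.count"])
        entry["deleted"] += int(row["docs.deleted"])
        entry["bytes"] += int(row["size"])
    return dict(totals)


# One illustrative segment per index, using the rounded numbers above.
sample_rows = [
    {"index": "es6-index", "prirep": "p", "docs.count": "1530000",
     "docs.deleted": "0", "size": "490000000"},
    {"index": "es7-index", "prirep": "p", "docs.count": "1530000",
     "docs.deleted": "0", "size": "540000000"},
]

for name, entry in summarize_segments(sample_rows).items():
    print(name, f"{entry['bytes'] / entry['docs']:.0f} bytes per doc")
```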

All segments are version 8.2.0 now. Does that mean the index was fully upgraded?

The types / fields in both indices are the same.
