Size reduction after re-indexing from v7 to v8

Ian_Simpson · July 20, 2023, 12:24am

I'm migrating from a v7.17.10 cluster to v8.8.2 using the reindex API. Some indices holding data from ingested files (mostly docx and pdf) now take up only about 5% of the space they used to. I've done a little spot checking and the data does seem to be searchable, so I'm wondering why this is. The normal data takes up about 80-90% of the space it used to.

Is this expected? I'm struggling to understand how this is possible, because I don't think even compressing the files themselves would yield such savings. I would also think that if the extra space came from my own dirty data, it would just carry straight over in the reindex.

If this is expected, about what final size should I plan for based on the v7 size?
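To go beyond spot checking, here is a rough sketch of comparing document counts and primary store size per index between the two clusters. It assumes the rows were fetched from each cluster with `GET _cat/indices?format=json&bytes=b`, which returns numeric columns as strings. The function name `compare_indices` and the sample numbers are made up for illustration, not taken from my clusters.

```python
def compare_indices(old_rows, new_rows):
    """Compare doc counts and primary store size for indices present in both clusters.

    Each row is a dict shaped like one entry of
    `GET _cat/indices?format=json&bytes=b` (numeric values arrive as strings).
    """
    new_by_name = {row["index"]: row for row in new_rows}
    report = []
    for old in old_rows:
        new = new_by_name.get(old["index"])
        if new is None:
            continue  # index was not reindexed under the same name
        old_size = int(old["pri.store.size"])
        new_size = int(new["pri.store.size"])
        report.append({
            "index": old["index"],
            "docs_match": int(old["docs.count"]) == int(new["docs.count"]),
            # Fraction of the old primary size that the new index occupies.
            "size_ratio": new_size / old_size if old_size else None,
        })
    return report


# Hypothetical sample rows, one index per cluster.
old_rows = [{"index": "files", "docs.count": "1000", "pri.store.size": "2000000"}]
new_rows = [{"index": "files", "docs.count": "1000", "pri.store.size": "100000"}]

report = compare_indices(old_rows, new_rows)
for entry in report:
    print(entry)
```

If `docs_match` is true while `size_ratio` is far below 1, the documents made it across and the difference is in how they are stored, not in missing data.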

