Behavior of ForceMerge

I have a question about the behavior of the _forcemerge API when it is called with the parameter only_expunge_deletes=true.

I'm planning to run _forcemerge?only_expunge_deletes=true on a large index (more than 500GB, 10 data nodes) to free up disk space.

So my question is: does this API use additional disk space while it cleans up deleted documents, and if so, how much?

Because all segments in Lucene are immutable, new segments without the deleted documents are created before the old ones are deleted. A forcemerge can therefore use a fair amount of additional disk space while it is running. I do not know how to predict exactly how much it needs, as this depends on the parameters used and on what the current segment distribution looks like.
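
For a rough feel of the numbers, here is a minimal Python sketch that estimates the temporary extra space from segment statistics. It is not an exact prediction and rests on several assumptions: the input rows look like the JSON output of `GET _cat/segments/<index>?format=json&bytes=b` (fields `docs.count`, `docs.deleted`, `size`); only segments whose deleted-document percentage exceeds a threshold are rewritten (assumed here to be 10%, the default I believe `index.merge.policy.expunge_deletes_allowed` has); a rewritten segment shrinks in proportion to its live documents; and, as a worst case, all rewritten data exists on disk before any old segment is removed. The function name `estimate_expunge_extra_bytes` is made up for this example.

```python
def estimate_expunge_extra_bytes(segments, allowed_deleted_pct=10.0):
    """Rough worst-case estimate of the temporary extra bytes needed.

    `segments` is a list of dicts with the keys "docs.count" (live docs),
    "docs.deleted" and "size" (bytes), as assumed from _cat/segments output.
    `allowed_deleted_pct` is an assumed threshold, not read from the cluster.
    """
    extra = 0
    for seg in segments:
        live = int(seg["docs.count"])
        deleted = int(seg["docs.deleted"])
        size = int(seg["size"])
        total = live + deleted
        if total == 0:
            continue  # empty segment, nothing to rewrite
        deleted_pct = 100.0 * deleted / total
        if deleted_pct > allowed_deleted_pct:
            # Assumption: the rewritten data is about as large as the
            # live fraction of the old segment.
            extra += size * live // total
    return extra


if __name__ == "__main__":
    # Made-up sample rows, not real cluster output.
    sample = [
        {"docs.count": "900", "docs.deleted": "100", "size": "1000"},
        {"docs.count": "500", "docs.deleted": "500", "size": "2000"},
    ]
    print(estimate_expunge_extra_bytes(sample))
```

Since disk space is consumed per node, it is probably more useful to group the rows by node or shard before summing than to look at one total for the whole index.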
