_delete_by_query and _forcemerge don't free disk space

Hi,

I used the _delete_by_query API to delete a large number of documents, and then the _forcemerge API to remove the deleted documents.
However, the _forcemerge call finishes instantly and disk usage stays the same. Why doesn't my API call do anything, and how can I debug this behavior? I tried the debug/trace log levels, but I am not sure which logger to look at, and when I enable trace on the root logger there are too many log entries to find anything useful.
I know the index contains many deleted documents, so I tried "only_expunge_deletes", but nothing changed. I also tried to force a single segment per shard with "max_num_segments", but again nothing changed.

Any suggestions?

Cluster configuration:
40 data nodes
3 master nodes
ES version 5.6.4
Daily indices (~200GB index size)
shards: 20 primary and 1 replica

What call did you launch exactly?

How much disk space do you have left on each node? Merging temporarily increases disk usage, because the new merged segments are written before the old ones are deleted.
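
To check both things, something like the following might help. It is only a sketch: it assumes the cluster is reachable at localhost:9200, "indexname" is a placeholder, and the function names are made up for illustration. It uses the _cat/allocation and _cat/indices APIs to show free disk per node and how many documents in the index are marked as deleted.

```shell
# Assumption: Elasticsearch is reachable here; override ES_URL if it is not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the _cat URL that lists disk used/available per data node.
allocation_url() {
  printf '%s/_cat/allocation?v&h=node,shards,disk.used,disk.avail,disk.percent' "$ES_URL"
}

# Build the _cat URL that shows live vs. deleted doc counts and store size
# for one index. $1 = index name (placeholder: indexname).
index_stats_url() {
  printf '%s/_cat/indices/%s?v&h=index,docs.count,docs.deleted,store.size' "$ES_URL" "$1"
}

# Hypothetical helper: print both tables for the given index.
show_disk_and_deletes() {
  curl -s "$(allocation_url)"
  curl -s "$(index_stats_url "$1")"
}

# Usage (run before and after the merge and compare docs.deleted and store.size):
#   show_disk_and_deletes indexname
```

If docs.deleted is still high after the force merge, the deleted documents have not been expunged yet.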

I used (multiple attempts with different params):

curl -XPOST 'http://localhost:9200/indexname/_forcemerge' -d '{
"only_expunge_deletes": false,
"max_num_segments": 1
}'

There is ~100GB free per node. The index size is ~200GB.
There shouldn't be an issue with free disk space, should there? I know that ES v2 executed _forcemerge without checking whether there was enough disk space.

How many nodes do you have in the cluster?

40 data nodes and 3 master nodes

OK, so then the index should take up relatively little space per node.

When I have used it, I think I invoked it like this:

curl -XPOST "http://localhost:9200/indexname/_forcemerge?max_num_segments=1"

Could you try that and see if that makes any difference?

That seems to be the solution :) Is this expected behaviour, or should both requests be valid?

The documentation does refer to request parameters, so I suspect it might be expected. It would probably be worthwhile adding an example to the docs though.
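
For reference, here is a rough sketch of passing the options as request parameters rather than in a JSON body. My guess is that the body in the earlier call was simply not read, so the merge ran with default settings, but I have not verified that. The host, the "indexname" placeholder, and the function names are assumptions for illustration; _cat/segments is used afterwards to see how many segments and deleted documents remain per shard.

```shell
# Assumption: Elasticsearch is reachable here; override ES_URL if it is not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a _forcemerge URL with the options in the query string.
# $1 = index name, $2 = query string such as "max_num_segments=1"
forcemerge_url() {
  printf '%s/%s/_forcemerge?%s' "$ES_URL" "$1" "$2"
}

# Hypothetical helper: run the force merge (the call blocks until it finishes).
forcemerge() {
  curl -s -XPOST "$(forcemerge_url "$1" "$2")"
}

# Hypothetical helper: list the segments of an index, one row per segment,
# including how many deleted documents each segment still holds.
list_segments() {
  curl -s "$ES_URL/_cat/segments/$1?v&h=index,shard,prirep,segment,docs.count,docs.deleted,size"
}

# Usage:
#   forcemerge indexname 'only_expunge_deletes=true'   # only drop deleted docs
#   forcemerge indexname 'max_num_segments=1'          # merge down to one segment per shard
#   list_segments indexname
```

After a merge with max_num_segments=1, list_segments should show a single segment per shard copy once the merge has completed.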

Great, thanks a lot for the help.
I created a PR for the documentation change: https://github.com/elastic/elasticsearch/pull/30113
