How to clear disk usage

health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   sks   QcmAeW2sToa3nqbPF7EVzQ  30   1 3101816690   1219212661     16.8tb          8.4tb

Hi all,
I have a very large Elasticsearch index. The cluster has 20 nodes, each with 16 GB RAM and a 1 TB SSD, and disk usage across the cluster is now at 90%. As you can see, almost 1/3 of the documents are deleted, but I can't merge them away with something like _forcemerge?only_expunge_deletes=false&max_num_segments=1&flush=true, because the merge will use double the disk storage while it runs, and I have no more nodes to add.
Is there any way I can decrease my disk usage?

Have you tried running _forcemerge?only_expunge_deletes=true to get at least some data cleaned up? I would expect this to reduce disk space at least to some extent, but am not sure how much space it would require (it also depends on whether you already have been forcemerging down to a single segment or not).
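As a minimal sketch of what that request looks like from a script, the snippet below builds and sends the force merge call with only the Python standard library. The base URL `http://localhost:9200` is an assumption (substitute your own cluster address), the helper names are made up for illustration, and authentication or TLS settings are not covered.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def forcemerge_url(base_url: str, index: str) -> str:
    """Build the URL for a force merge that only expunges deleted documents."""
    query = urlencode({"only_expunge_deletes": "true"})
    return f"{base_url.rstrip('/')}/{index}/_forcemerge?{query}"


def expunge_deletes(base_url: str, index: str) -> str:
    """Send the force merge as a POST and return the raw response body.

    The request normally blocks until the merge completes, so expect it
    to run for a long time on a large index.
    """
    request = Request(forcemerge_url(base_url, index), method="POST")
    with urlopen(request) as response:
        return response.read().decode("utf-8")


# Assumed address; nothing is sent until expunge_deletes() is called.
print(forcemerge_url("http://localhost:9200", "sks"))
```

Calling `expunge_deletes("http://localhost:9200", "sks")` would actually start the merge, so it is worth trying on a small test index first.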

If you have taken a snapshot it might also be worth considering dropping the replica shards during the cleanup. It will make more space available but severely affect resiliency.
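For reference, dropping the replicas is an index settings update. The following is a rough sketch using only the Python standard library; the base URL and the helper names are assumptions for illustration, and you would set the replica count back to 1 the same way once the cleanup is done.

```python
import json
from urllib.request import Request, urlopen


def replica_settings_request(base_url: str, index: str, replicas: int) -> Request:
    """Build a PUT <index>/_settings request that changes the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_replicas(base_url: str, index: str, replicas: int) -> str:
    """Send the settings update and return the raw response body."""
    with urlopen(replica_settings_request(base_url, index, replicas)) as response:
        return response.read().decode("utf-8")


# Assumed address; this only builds the request and does not send it.
request = replica_settings_request("http://localhost:9200", "sks", 0)
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

With replicas at 0, a single node failure loses data until it is restored from the snapshot, which is why the snapshot comes first.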

As you are approaching/reaching the limit it does however look like you will need to increase disk space on the nodes or add nodes to properly solve the problem. If you don't you will run into this problem again and at some point it may not be possible to fix it.

_forcemerge?only_expunge_deletes=true would be dangerous for my cluster. As far as I know, this operation will use double the disk storage, and my cluster has no free space. I want a way to decrease storage without using any additional disk storage.

Lucene, which is what powers Elasticsearch, creates immutable segments. Deleting data from an index creates a tombstone record in a new segment, and only when the old segment is merged is the space used by the deleted document cleaned up. Merging triggered by Elasticsearch and forcemerging are therefore the only ways to get rid of deleted data. Any delete will therefore always require additional storage to be available.

If you have lots of small segments, some of these may merge without affecting all the others, which may only require a small amount of additional storage. If, however, you have previously forcemerged down to a single huge segment, that whole segment will need to be rewritten in order to remove the deleted documents, which will result in a lot more space being needed (likely double).
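To put rough numbers on this, here is a back-of-the-envelope sketch using the figures from the index listing at the top of the thread. It assumes that all documents are about the same size, that each primary shard is one single segment, and that a merge writes the surviving documents into a new segment before the old one is removed. These are simplifying assumptions, not measured values.

```python
def deleted_fraction(docs_count: int, docs_deleted: int) -> float:
    """Share of all documents on disk that are marked as deleted."""
    return docs_deleted / (docs_count + docs_deleted)


def temporary_merge_space(segment_size: float, fraction_deleted: float) -> float:
    """Extra space needed while a segment is rewritten without its deleted docs.

    The new segment holds only the live documents, and the old segment
    stays on disk until the merge has finished.
    """
    return segment_size * (1.0 - fraction_deleted)


# Figures from the listing: docs.count, docs.deleted, 8.4tb primary size, 30 shards.
fraction = deleted_fraction(3_101_816_690, 1_219_212_661)
shard_tb = 8.4 / 30
extra_tb = temporary_merge_space(shard_tb, fraction)

print(f"deleted fraction: {fraction:.1%}")
print(f"average primary shard size: {shard_tb:.2f} TB")
print(f"temporary extra space per single-segment shard: ~{extra_tb:.2f} TB")
```

Under these assumptions the extra space is less than a full second copy, but it is still a large share of the shard size, and it is needed on every node that merges a shard at that moment.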

You therefore need to drop the replica while deleting (temporary workaround) or add more nodes or disk space to the cluster (long term solution).

How do I find out how many segments my cluster has?
Would setting max_num_segments to some other value use less storage?

It is per index so look at index stats.
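One way to get the count per shard is the cat segments API, which is used here in place of index stats. The sketch below assumes the JSON form of that output (`GET /_cat/segments/sks?format=json`), which is a list with one entry per segment. The sample rows are made up for illustration and are not taken from this cluster.

```python
from collections import Counter


def segments_per_shard(rows):
    """Count segments for each (shard, prirep) pair in cat segments output."""
    counts = Counter()
    for row in rows:
        counts[(row["shard"], row["prirep"])] += 1
    return dict(counts)


# Hypothetical rows in the shape returned with format=json (values are strings).
sample_rows = [
    {"index": "sks", "shard": "0", "prirep": "p", "segment": "_a"},
    {"index": "sks", "shard": "0", "prirep": "p", "segment": "_b"},
    {"index": "sks", "shard": "0", "prirep": "r", "segment": "_a"},
    {"index": "sks", "shard": "1", "prirep": "p", "segment": "_c"},
]

for (shard, prirep), count in sorted(segments_per_shard(sample_rows).items()):
    print(f"shard {shard} ({prirep}): {count} segment(s)")
```

A shard that shows a single segment has most likely been forcemerged before, which is the expensive case described earlier.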

No, but if you have multiple segments it may be that only a few are merged at a time, which requires less additional storage for the merge, e.g. less than double the index size.

Can I cancel the merge command if my storage turns out not to be enough to complete the merge?

I think you need to resolve the issue as per my previous comments. There is no way around that.
