How to defrag an index?


Our use case:

  • pretty big cluster - billions of docs
  • we update documents in place
  • data are not time-sliced as we often do retrieve and modify old documents
  • Issue: over time we accumulated a lot of deleted documents in the indices; it is close to 20%
  • we are on 1.6.x

It turns out that we have a few segments close to 5GB, and with the default settings Elasticsearch doesn't want to merge them.

We'd like to be able to defragment the cluster to avoid wasting space, especially since the number of deleted docs grows over time.

I see two solutions here:

  1. Change the merge policy to something like this

    index.merge.policy.max_merged_segment: 20gb # 5gb is the default
    index.merge.policy.reclaim_deletes_weight: 3.0 # 2 is the default

This should help us right now, but it only pushes the issue out in time: once we accumulate 20% of deleted docs in these 20gb segments, we'll be back where we are now.

  2. Manually optimize the indices using the optimize API: _optimize?max_num_segments=1

We can make it a weekly or monthly job, but I'm afraid that the segments will grow unbounded this way and eventually we will kill the cluster performance.
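For reference, both options can be tried over the REST API. This is just a sketch against a 1.x cluster; the host and the index name `myindex` are placeholders:

```shell
# Option 1: raise the merge ceiling and bias merges toward reclaiming
# deletes (merge policy settings are dynamically updatable in 1.x)
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.merge.policy.max_merged_segment": "20gb",
  "index.merge.policy.reclaim_deletes_weight": 3.0
}'

# Option 2: force-merge everything down to one segment per shard
curl -XPOST 'localhost:9200/myindex/_optimize?max_num_segments=1'
```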

Q1. I guess what we really want is some kind of an _optimize which will turn e.g.

  • 5 * 5gb segments with 20% deleted
  • into 4 * 5gb segments with 0% deleted
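The back-of-envelope arithmetic behind that expectation, using the sizes and ratio quoted above:

```shell
# 5 segments of 5 GB each, ~20% of the docs in them deleted
segments=5; segment_gb=5; deleted_pct=20

total_gb=$((segments * segment_gb))               # 25 GB on disk
live_gb=$((total_gb * (100 - deleted_pct) / 100)) # 20 GB of live docs
ideal_segments=$((live_gb / segment_gb))          # fits in 4 full segments

echo "$total_gb $live_gb $ideal_segments"         # prints: 25 20 4
```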

Q2. Is there any other way this use case could be handled without reindexing?

Q3. Do big shards have any negative impact on the cluster?

Q1: you may want to send an _optimize with only_expunge_deletes=true
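In curl form (again a sketch; the index name is a placeholder). Unlike max_num_segments=1, this only merges segments whose deleted-doc ratio exceeds the merge policy's expunge threshold (index.merge.policy.expunge_deletes_allowed, 10% by default), rather than collapsing everything into one jumbo segment:

```shell
# Reclaim space from deleted docs without forcing a single segment
curl -XPOST 'localhost:9200/myindex/_optimize?only_expunge_deletes=true'
```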

Q2: leave deleted documents in the index and filter them out by some criterion at search time, or rearrange your index organization so that old/unneeded indices can be dropped

Q3: yes

Thanks Jörg for the answers. It looks like there is no way to merge 5 shards into 4 shards of similar size, right? You can only merge 5 segments into one big segment?

Each shard is an individual Lucene index. Each index is made up of small
segments, which are immutable and are merged from time to time. The number
of shards cannot be changed once an index has been created.

I cannot see your original question since this "mailing list" does not
always deliver emails.


Instead of shards I meant segments. Edited the post.

Segment merging always goes down to a single segment. I've run into this issue before, btw. Ultimately we lived with the delete overhead.

Calling _optimize can actually make the trouble worse because it makes even bigger segments which the merge scheduler wants to merge even less than it wants to merge the ones around 5gb.

It's something I've talked about with @mikemccand a few times, but we never came up with a good solution.

Thanks guys, so probably we'll also need to live with some overhead. We'll at least try to understand the merge policy in more detail and maybe tweak it a bit to match our use case.