Deleted documents stuck above 40%

Hi,
I recently removed many documents from my index, but after 3 weeks the index still shows more than 40% deleted documents. I know the merge process should eventually clean this up, but it hasn't happened yet and the index size has not gone down. I also don't want to run a force merge with the expunge-deletes option, as the last time I did that my index became very unstable (yellow).
What can be done to make Elasticsearch clean up all the deleted docs (reindexing is also not an option)?
Thanks

Can you please post the output of GET /_cat/segments/<index_name>?v
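
If it helps, a minimal sketch of that call with curl (host and index name are placeholders):

    curl -s 'http://localhost:9200/_cat/segments/my_index?v'

The docs.deleted and size columns are the interesting ones here.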

Well, it's a really big output to paste here.

Can you please use pastebin or maybe gist?

Please:

You have at least 3 segments there that are really big (~5GB), and most of the deleted documents are in them. Due to the way the merging algorithm works, it is going to take quite some time until they get merged.
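
A possibly relevant detail: those segments sit right at the merge policy's default maximum segment size (index.merge.policy.max_merged_segment, 5gb by default), and as far as I understand, segments at that cap are only reconsidered for merging once deletes have shrunk their effective size. A hedged sketch of raising the cap dynamically (my_index and the 10gb value are just illustrative, and whether that is wise depends on your hardware):

    curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
      "index.merge.policy.max_merged_segment": "10gb"
    }'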

There are a few possible causes for this. Do you constantly update and/or delete documents very fast? Did you call forcemerge in the past and then keep indexing into it?

For now, the fastest way to fix this is to reindex.

Usually I don't do massive deletes, but I did one a few weeks ago.
I didn't call forcemerge on that index.
And reindexing is really not an option: the index doesn't store the _source field, so I would have to reindex from the original data, which is about 0.5 PB.

If you want to keep that index and purge the deleted documents, then the only way is to forcemerge it down to 1 segment and stop writing to the index (i.e. no deletes/indexing/updates); otherwise it will just make things worse.
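
For reference, a sketch of what that would look like (host and index name are placeholders; on 1.x the endpoint is _optimize, renamed to _forcemerge in 2.x):

    # 1.x
    curl -XPOST 'http://localhost:9200/my_index/_optimize?max_num_segments=1'
    # 2.x and later
    curl -XPOST 'http://localhost:9200/my_index/_forcemerge?max_num_segments=1'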

So just waiting for merging will not solve the problem? I can't stop writing to it, it is used by customers.

It's going to take a while and it's very difficult to estimate.

Check the videos in this blog post to get an idea: http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

Thanks, I think waiting is the only option for me. Btw, if I decide to go for the expunge-deletes force merge, can I tell the index not to perform any write operations?

You can mark an index as read-only. Actually, you can keep writing to that index; the only real issue is if you end up in the same situation by doing a massive delete again.

You should avoid doing massive deletes anyway. If your data has a timestamp, it is strongly recommended to use time-based indices: when you need to delete documents based on a time range, you can drop entire indices instead, which is much cheaper.
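
For example, with daily indices a time-range delete becomes a cheap index drop (the index name is just illustrative):

    curl -XDELETE 'http://localhost:9200/events-2016.01.15'

Deleting a whole index frees the disk space immediately, instead of leaving tombstones that merging has to clean up later.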

To make an index read-only, set index.blocks.read_only: true
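
Something along these lines (index name is a placeholder):

    curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
      "index.blocks.read_only": true
    }'

Set it back to false once the merge has finished.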

No, my index is not time-based.
So what is the best practice if I do need to perform a massive delete again?

Setting the index to read-only will not prevent merging?
Btw, I once did a force merge to 1 segment on a big index, and it caused the index to go yellow and many shards went into recovery. Can an expunge-deletes-only merge also cause this?

Thanks

How did you perform this massive delete? Was it using the bulk API? Or maybe using delete by query?

It is not expected to prevent merging. Indeed, the forcemerge API is generally recommended as a housekeeping operation for smaller indices that are read-only. You could try setting it to only expunge deletes; it may help, but I don't expect much, since you have massive segments with deleted documents.
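
A sketch of the expunge-deletes-only variant (again, _optimize on 1.x, _forcemerge on 2.x+; index name is a placeholder):

    curl -XPOST 'http://localhost:9200/my_index/_optimize?only_expunge_deletes=true'

This only rewrites segments whose share of deleted documents exceeds a threshold (10% by default), so it is cheaper than merging everything down to one segment.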

I used the bulk API to delete.
Btw, if I delete an entire type, will that release the space, or will the behavior be exactly the same?
I'm really afraid of the force merge operation, as my index is huge.
Thanks

It looks like you are on a very old version of Elasticsearch (1.7 maybe?), because I see Lucene version 4.10 in your segments. While I'm not sure this is the exact issue here, I do remember an issue long ago with deletions alone not triggering merges: https://issues.apache.org/jira/browse/LUCENE-6166.

You should upgrade to a more recent version, but in your case, that will mean reindexing.

There is no difference, as there is no physical separation of the types; all types go into the same physical shard.

While doing the bulk delete, did you change the index refresh interval? What refresh interval do you use?

I checked, and my refresh interval is 20 sec, as I don't need a real-time index.
Do you think I can make it even bigger?
Thanks

Yes, my ES is really old, 1.7.5.
The bug you referenced, as I understand it, is about an index with only deletions and no new inserts, correct? In my case we also keep indexing, and heavily.
Upgrading means reindexing, and that is something we can't afford right now.

Thanks

This is not a bug; it is simply how Elasticsearch merging works. This happened because you have the refresh interval set to 20 secs and probably used a large bulk size for the deletes as well. To avoid problems with massive deletes in the future, set refresh to 1 sec and use small bulks (around 100-300 documents).
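
A hedged sketch of both pieces of that advice (host, index, type and document IDs are placeholders; on 1.x bulk delete actions also need a _type):

    # lower the refresh interval before a massive delete
    curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
      "index.refresh_interval": "1s"
    }'

    # then delete in small bulks of a few hundred documents
    curl -XPOST 'http://localhost:9200/_bulk' -d '
    {"delete":{"_index":"my_index","_type":"my_type","_id":"1"}}
    {"delete":{"_index":"my_index","_type":"my_type","_id":"2"}}
    '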

For now you could try a forcemerge with expunge deletes only, but this might have unwanted consequences; the best thing to do would be reindexing.