Elasticsearch 5.5 - Deleted document count above 50%, and low disk watermark consistently hit

Hello All,

I believe I have a storage issue with regard to deleted documents.

I was hoping to let Elasticsearch/Lucene do its thing and merge out the deletes, but I'm not sure why this isn't happening. It's only a problem because I am now consistently hitting my low disk watermark, which means replicas will not be assigned.
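To put numbers on both symptoms, here is a minimal Python sketch. It assumes the document counts come from `docs.count` and `docs.deleted` in the index stats (`GET /<index>/_stats/docs`), that "deleted percentage" means deleted documents over live plus deleted, and that the low watermark is at its default of 85% disk used. The function names are made up for illustration.

```python
def deleted_ratio(doc_count, deleted_count):
    """Fraction of documents in the index that are marked deleted.

    Assumption: doc_count is docs.count (live documents) and
    deleted_count is docs.deleted from the index stats.
    """
    total = doc_count + deleted_count
    return deleted_count / total if total else 0.0


def over_low_watermark(used_bytes, total_bytes, low_watermark=0.85):
    """True if disk usage is above the low watermark.

    Assumption: the watermark is expressed as a used-disk fraction and
    left at its default (85%); adjust if the cluster overrides it.
    """
    return used_bytes / total_bytes > low_watermark


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real cluster.
    print("deleted ratio: {:.0%}".format(deleted_ratio(4_000_000, 5_000_000)))
    print("over low watermark:", over_low_watermark(870, 1000))
```

Once a data node is over the low watermark, no new shards (including replicas) are allocated to it, which matches the unassigned replicas described above.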

I've run a force merge to expunge deletes, but it is painfully slow: the index is 325GB, and the merge has been running for 13 hours and is still going according to the tasks API.
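For reference, the two calls involved can be sketched in Python roughly as follows. The host `http://localhost:9200`, the index name `my-index`, and the function names are placeholders, and the `*forcemerge*` action filter is my assumption about how to narrow the task list. The sketch only builds the requests; nothing is sent until they are passed to `urllib.request.urlopen`.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: local, unsecured node


def forcemerge_request(index, base_url=BASE_URL):
    """Build POST /<index>/_forcemerge?only_expunge_deletes=true."""
    query = urlencode({"only_expunge_deletes": "true"})
    url = "{}/{}/_forcemerge?{}".format(base_url, quote(index, safe=""), query)
    return Request(url, method="POST")


def forcemerge_tasks_request(base_url=BASE_URL):
    """Build GET /_tasks filtered to force merge actions.

    Assumption: the wildcard pattern matches the force merge action name.
    """
    query = urlencode({"actions": "*forcemerge*", "detailed": "true"})
    return Request("{}/_tasks?{}".format(base_url, query), method="GET")


if __name__ == "__main__":
    for req in (forcemerge_request("my-index"), forcemerge_tasks_request()):
        print(req.get_method(), req.full_url)
```

The force merge call blocks until the merge finishes, so polling the tasks endpoint from a second client is the practical way to see that it is still running.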

As for our use case, we have about 15 types. Some we update often (like diary events, think calendar); others are much more static.

Any recommendations where I can look to help me out?

Thanks for your help in advance,
Kind Regards,
Matt

What version are you on?

Sorry, I put the version in the title but didn't mention it in the body.

I am on version 5.5 (the release before they back-ported the fix for the Amazon S3 errors).

Thanks for any help

Matt

If possible it might be beneficial to break out types that are updated frequently into separate indices. That should mean that less data is being merged frequently, which I would expect to speed the process up.

Thank you for your reply - whilst it is certainly not easy for our application to change in that way, I will see if it is something we can entertain. Thank you for this.

Meanwhile, are there any particular cases where segment merging would not take place? At the rate of growth I am seeing, the low watermark is always hit. I have read that if the deleted document count in a segment is below 10%, the merge won't bother with it (perfectly understandable and fine). Are there any other heuristics I need to account for? Or any features I could have enabled or disabled that would make merging happen less often?
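One way to see where the deletes actually sit is to look at them per segment. The sketch below is a rough Python helper, assuming the input is the parsed JSON from `GET /<index>/_segments` with `num_docs`, `deleted_docs` and `size_in_bytes` on each segment. The 10% default is simply the figure mentioned above, and the function name is made up for illustration.

```python
def segments_with_deletes(segments_response, min_deleted_pct=10.0):
    """List segments whose deleted-document percentage exceeds the threshold.

    Assumption: segments_response is the parsed body of GET /<index>/_segments.
    Results are sorted largest segment first.
    """
    flagged = []
    for index_name, index_info in segments_response.get("indices", {}).items():
        for shard_id, copies in index_info.get("shards", {}).items():
            for copy in copies:  # one entry per shard copy (primary/replica)
                for seg_name, seg in copy.get("segments", {}).items():
                    total = seg["num_docs"] + seg["deleted_docs"]
                    if total == 0:
                        continue
                    pct = 100.0 * seg["deleted_docs"] / total
                    if pct > min_deleted_pct:
                        flagged.append({
                            "index": index_name,
                            "shard": shard_id,
                            "segment": seg_name,
                            "deleted_pct": pct,
                            "size_in_bytes": seg["size_in_bytes"],
                        })
    flagged.sort(key=lambda s: s["size_in_bytes"], reverse=True)
    return flagged


if __name__ == "__main__":
    # Hand-written sample in the assumed response shape, not real output.
    sample = {"indices": {"my-index": {"shards": {"0": [{"segments": {
        "_0": {"num_docs": 50, "deleted_docs": 50, "size_in_bytes": 1000},
        "_1": {"num_docs": 95, "deleted_docs": 5, "size_in_bytes": 400},
    }}]}}}}
    for seg in segments_with_deletes(sample):
        print(seg)
```

If the largest segments dominate the list, one thing worth checking (an assumption on my part, not something confirmed in this thread) is whether they are already close to `index.merge.policy.max_merged_segment`, since the tiered merge policy is reluctant to pick very large segments for natural merges.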

Thanks again,
Matt
