Need Help / Suggestion with Deleted Documents

apmalik80 · February 22, 2016, 1:40pm

One of our indices currently shows the following stats:

The stats show that the total document count is 2081, whereas a considerable number of documents are marked as deleted. I think this happened because of the Lucene engine's default settings.

We can reclaim the space using the expunge deletes option, but is it possible to somehow unmark / restore the deleted documents, either in another cluster or within the same cluster?
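
For reference, on Elasticsearch 1.x the expunge deletes option corresponds to the `_optimize` endpoint called with `only_expunge_deletes=true`. The sketch below only builds that request and does not send it; the host `http://localhost:9200` and the index name `my_index` are placeholders, not values from this thread. Keep in mind that this kind of merge drops the documents marked as deleted from the segments, so it reclaims space rather than restoring anything.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_expunge_request(base_url, index):
    """Build (but do not send) an ES 1.x _optimize request that only expunges deletes."""
    query = urlencode({"only_expunge_deletes": "true"})
    url = "{}/{}/_optimize?{}".format(base_url.rstrip("/"), quote(index, safe=""), query)
    # An empty body with an explicit POST method; nothing is sent until urlopen() is called.
    return Request(url, data=b"", method="POST")


# Placeholder host and index name (assumptions, not taken from the thread).
req = build_expunge_request("http://localhost:9200", "my_index")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually run the merge, so it is worth testing on a copy of the index first.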

jpountz · February 22, 2016, 6:36pm

Can you let us know what version you are running and whether you have errors in the logs (e.g. failed merges)?

apmalik80 · February 23, 2016, 10:21am

Hi Jpountz,

The index was originally created on Elasticsearch 1.4; Elasticsearch was recently upgraded to 1.7 after I took over the work.
I have checked the status of the index, and it states that no merges have taken place, as shown in the diagram below.

I have checked the logs, but there are no entries detailing any error for this index.

However, while going through the different statuses, I think I need to upgrade the index. Sorry, I am a bit new to these advanced Elasticsearch configurations, as previously we were using a basic configuration with small tweaks. We are now thinking of moving to production, so we are looking into these issues.

Below are the detailed screenshots of the index.

As the screenshots show, the same issue has happened in a couple of other indices. One more detail: in all these cases the index has only 1 shard.

jpountz · February 24, 2016, 5:44am

@mikemccand Does it ring a bell to you?

Christian_Dahlqvist · February 24, 2016, 5:55am

It is interesting to see that the number of deleted documents is almost consistently around 10000 times the number of existing documents per the count. Is this from a test or production environment? Are you using TTL? Are you perhaps updating or indexing the same documents repeatedly with the same IDs, resulting in deleted documents as a side effect of updates?
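
To put a number on that ratio, the `docs` section of the index stats (`GET /<index>/_stats/docs`) reports both `count` and `deleted`. A rough sketch, assuming the usual `_all.primaries.docs` layout of that response; the `deleted` figure in the sample is made up for illustration, since the actual value is only visible in the screenshots.

```python
def deleted_per_live(stats):
    """Return how many deleted docs exist per live doc, from an index stats response."""
    docs = stats["_all"]["primaries"]["docs"]
    live, deleted = docs["count"], docs["deleted"]
    # Avoid dividing by zero for an index with no live documents.
    return deleted / live if live else float("inf")


def deleted_fraction(stats):
    """Return the share of all docs in the segments that are marked as deleted."""
    docs = stats["_all"]["primaries"]["docs"]
    total = docs["count"] + docs["deleted"]
    return docs["deleted"] / total if total else 0.0


# count=2081 comes from the original post; deleted is a hypothetical value.
sample_stats = {"_all": {"primaries": {"docs": {"count": 2081, "deleted": 20810000}}}}
print(deleted_per_live(sample_stats), deleted_fraction(sample_stats))
```

A stable multiple across several indices, as opposed to a varying one, is what would hint at a systematic cause such as repeated re-indexing of the same IDs.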

mikemccand · February 24, 2016, 9:23am

Is it possible you did a bunch of deletes and didn't add any new documents
to this index?

Lucene previously would fail to trigger a merge in that case:
https://issues.apache.org/jira/browse/LUCENE-6166

This is fixed in Lucene 5.3 / ES 2.1.0.

If you add one document to the indices does that trigger a merge?

Mike McCandless
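
One way to check this is to compare the merge counter in the index stats (`GET /<index>/_stats/merge`) taken before and after indexing a single document. A small sketch, assuming the `_all.primaries.merges.total` field of that response; the two sample dicts are hypothetical, and since merges run in the background the second snapshot may need to be taken a little later.

```python
def merges_total(stats):
    """Read the cumulative number of completed merges from an index stats response."""
    return stats["_all"]["primaries"]["merges"]["total"]


def merge_triggered(before, after):
    """True if at least one merge completed between the two stats snapshots."""
    return merges_total(after) > merges_total(before)


# Hypothetical snapshots taken before and after adding one document.
before_stats = {"_all": {"primaries": {"merges": {"current": 0, "total": 0}}}}
after_stats = {"_all": {"primaries": {"merges": {"current": 0, "total": 1}}}}
print(merge_triggered(before_stats, after_stats))
```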

apmalik80 · February 26, 2016, 8:40am

@Christian_Dahlqvist yes, you are correct; the deleted document counts are as you described. This is from the test environment, as I push one copy to the test environment. I am now faced with this issue because we have to run some analysis on all the information collected, and with documents marked as deleted it seems that most of it is lost. I have checked the original program from which this index was generated and, no, the documents cannot have the same IDs.

@mikemccand no, the documents were not deleted and now new documents were added. My initial view was also along the same lines, but I had not come across this patch; thanks for it. By the way, is it okay if I trigger a merge manually, directly on Lucene? Will I be able to recover part of the documents?

Thanks @Christian_Dahlqvist, @mikemccand for your comments.
