Why would the docs.deleted count in some indices exceed 10 times the docs.count?

OS: CentOS 7.8
ES version: 7.5.2 with bundled JDK (openjdk version "13.0.1" 2019-10-15)

The default segment merge parameter index.merge.policy.deletes_pct_allowed is 33, meaning deleted documents should make up at most about 33% of all documents in an index (live + deleted). Theoretically, docs.deleted should therefore not exceed roughly half of docs.count (0.33 / 0.67 ≈ 0.49).

In most of my indices the docs.deleted count follows this rule, but in some indices it significantly exceeds the threshold implied by the merge policy. For example:
{
  "docs": {
    "count": 104309,
    "deleted": 1337471
  }
}
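
For reference, the numbers above come from the index stats output. A quick way to list indices sorted by deleted documents is a cat indices request along these lines (the column selection and sort here are just an example):

# list indices with the most deleted documents first
GET _cat/indices?v&h=index,docs.count,docs.deleted&s=docs.deleted:desc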

I tried changing deletes_pct_allowed to 20, but from the monitoring data it seems that docs.deleted is still only being cleared after reaching the original threshold, which suggests the setting may not be taking effect.
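
This is the kind of dynamic settings update I mean (the index name my-index is a placeholder):

# lower the allowed percentage of deleted documents; the valid range is 20 to 50
PUT /my-index/_settings
{
  "index.merge.policy.deletes_pct_allowed": 20
}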

There are no nested types in my index mapping.

What could be the reason for this discrepancy?

If you are doing updates, that could explain it.

An update is basically 1 delete + 1 insert...
So the number of deletions can be high if you are doing a lot of updates.

The index does indeed have frequent update operations, but I observed that docs.deleted is expunged very infrequently, only every few hours, as shown in the following figure:

What could cause the docs.deleted count to significantly exceed docs.count and not be expunged for a long time? Are there any settings that would make the expunging of deleted documents more frequent?

The Force Merge API could help. See the only_expunge_deletes option.
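
A minimal example, with my-index as a placeholder:

# rewrite only segments whose deleted-doc percentage is above the expunge threshold,
# without forcing the index down to a single segment
POST /my-index/_forcemerge?only_expunge_deletes=true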

Maybe you can control the index.merge.policy.expunge_deletes_allowed index setting? It defaults to 10. It's not mentioned in the merge settings documentation though, so maybe it's not recommended to change it...
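
If you do experiment with it, it's a dynamic index setting, so an update like this should apply it (my-index is a placeholder; a lower threshold makes the expunge-only force merge consider more segments):

# segments with more than this percentage of deleted docs qualify for expunge merges
PUT /my-index/_settings
{
  "index.merge.policy.expunge_deletes_allowed": 5
}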

Merging can be an expensive and I/O-intensive operation, so there is a good reason it is not necessarily performed very aggressively by default. Why are you looking to change this and make it more aggressive? Is it using up too much disk space? Is it affecting the page cache hit rate? Are you seeing a performance impact as the number of deleted documents grows?


The main purpose is to improve query performance, as a high number of docs.deleted can result in increased CPU usage during queries. For example, in the chart below, each time docs.deleted is expunged, there is a noticeable decrease in CPU usage.


The CPU usage does not by itself seem excessive or problematic. Is it resulting in longer query latencies? Are you tracking query latency as a function of the deleted document percentage? Is there a correlation?

What are the issues around query performance you are seeing?

The query QPS of this ES cluster is very high, and the cluster has many data nodes. If we could keep the CPU usage shown in the chart above at 20% rather than letting it rise to 30%, we could reduce the number of servers by close to one third. Each server is equipped with NVMe SSD drives, so the disk I/O load from segment merging is not a bottleneck.

I aim to configure the cluster to prioritize the cleanup of deleted documents during regular segment merges, keeping docs.deleted low so that CPU usage stays at a lower level.