From a GC/memory point of view, which is more efficient? I have multiple indices in an ES cluster, one of which has 19,809,064 documents. The cluster has 8 data nodes and 3 masters, with one shard per node per index. There is another index with just 200,000 documents. Which is more efficient in terms of heap usage: deleting the entire index, or bulk deleting ~10,000 documents at a time?
From the docs it seems that deleting an index is more efficient, as entire segments are dropped, versus deleting documents and relying on segment merging. Is the same structure followed in memory?
Deleting a full index is, as you say, much more efficient than bulk deleting individual documents. Documents are, however, not kept in memory, so I am not sure I understand your last question.
Index structures are cached in memory, not individual documents. Most of this is done at the shard/segment level and is therefore efficiently removed when an index is deleted.
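To make the contrast concrete, here is a sketch of the two approaches as Elasticsearch REST calls (Kibana console style). The index name `my-index` and the query are hypothetical placeholders; the APIs themselves (`DELETE <index>` and `_delete_by_query` with `max_docs`) are standard Elasticsearch endpoints.

```
# Drop the whole index: shard files and their cached structures
# are released outright — no per-document tombstones, no merges.
DELETE /my-index

# Bulk delete ~10,000 documents at a time: each deletion marks a
# tombstone that still occupies segment (and cached) space until
# segment merges eventually reclaim it.
POST /my-index/_delete_by_query
{
  "max_docs": 10000,
  "query": {
    "range": { "timestamp": { "lt": "now-30d" } }
  }
}
```

So if you no longer need any of the documents in the index, the first form is strictly cheaper; `_delete_by_query` only makes sense when you must keep a subset.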