From a GC/memory point of view, which is more efficient? I have multiple indices in an ES cluster, one of which has 19,809,064 documents. The cluster has 8 data nodes and 3 masters, with one shard per node per index. There is another index with just 200,000 documents. Which is more efficient in terms of heap usage: deleting the entire index, or bulk deleting ~10,000 documents at a time?
From the docs it seems that deleting an index is more efficient, as entire segments are dropped, versus deleting documents and relying on segment merging. Is the same structure followed in memory?
Deleting a full index is, as you say, much more efficient than bulk deleting individual documents. Documents are, however, not kept in memory, so I am not sure I understand your last question.
Index structures are cached in memory, not individual documents. Most of this is done at the shard/segment level and is therefore efficiently removed when an index is deleted.
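To make the contrast concrete, here is a sketch of the two approaches as Elasticsearch REST calls (Kibana console style). The index name `my-index` and the query are hypothetical placeholders; the APIs themselves (`DELETE <index>` and `_delete_by_query` with `max_docs`) are standard Elasticsearch endpoints.

```
# Drop the whole index: shard files and their cached structures
# are released outright — no per-document tombstones, no merges.
DELETE /my-index

# Bulk delete ~10,000 documents at a time: each deletion marks a
# tombstone that still occupies segment (and cached) space until
# segment merges eventually reclaim it.
POST /my-index/_delete_by_query
{
  "max_docs": 10000,
  "query": {
    "range": { "timestamp": { "lt": "now-30d" } }
  }
}
```

So if you no longer need any of the documents in the index, the first form is strictly cheaper; `_delete_by_query` only makes sense when you must keep a subset.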