We have an index that has been rolled over many times by an ILM policy, which also force-merges the rolled-over indices.
We will never update those rolled-over indices or write new documents to them, but we do want to delete some documents from them.
The Elasticsearch documentation on how Lucene actually deletes a document says:
When a document is deleted or updated (= delete + add), Apache Lucene simply marks a bit in a per-segment bitset to record that the document is deleted. All subsequent searches simply skip any deleted documents.
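To make our reading of that passage concrete, here is a toy Python model of a segment with a per-document "live" bitset. It is only a sketch of the idea described in the quote, not Lucene's actual implementation; the class name `ToySegment` and its methods are made up for illustration.

```python
class ToySegment:
    """Toy model of an immutable segment plus a per-document live flag."""

    def __init__(self, docs):
        self.docs = tuple(docs)              # stored documents never change
        self.live = [True] * len(self.docs)  # one flag per stored document

    def delete(self, doc_id):
        """Mark the matching document as deleted; nothing is rewritten."""
        for i, doc in enumerate(self.docs):
            if doc["id"] == doc_id and self.live[i]:
                self.live[i] = False
                return True
        return False

    def search(self, predicate):
        """Return matching documents, skipping those flagged as deleted."""
        return [
            doc
            for doc, alive in zip(self.docs, self.live)
            if alive and predicate(doc)
        ]


segment = ToySegment([{"id": 1, "tag": "a"}, {"id": 2, "tag": "a"}])
segment.delete(1)
visible = segment.search(lambda doc: doc["tag"] == "a")
```

In this model a delete only flips a flag: the stored documents and the number of segments stay the same, which is the behaviour we expected from the real cluster.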
Based on this explanation, we expected that the force-merged segments would not change and that the total number of segments would stay the same. What actually happens is that new segments are created. On top of that, deleting a list of documents takes a very long time (the rolled-over indices have been relocated from SSD to HDD).
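For anyone who wants to reproduce the observation, the segment count can be compared before and after a delete roughly as follows. This is a minimal sketch using only the Python standard library together with the `_cat/segments` and `_delete_by_query` endpoints; the cluster URL, the index name in the usage note, and the helper names are placeholders and assumptions, not our real setup.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: an unsecured local test cluster


def summarize_segments(rows):
    """Summarize the parsed JSON of GET /_cat/segments/<index>?format=json.

    The cat API lists one row per segment per shard copy, so replica
    shards are counted as well. Numeric columns arrive as strings.
    """
    return {
        "segments": len(rows),
        "docs_deleted": sum(int(row["docs.deleted"]) for row in rows),
    }


def fetch_segments(index, base_url=ES_URL):
    """Fetch the segment listing for one index as a list of dicts."""
    url = f"{base_url}/_cat/segments/{index}?format=json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def delete_by_query_request(index, query, base_url=ES_URL):
    """Build (but do not send) a POST /<index>/_delete_by_query request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_delete_by_query?refresh=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def measure_delete(index, doc_ids, base_url=ES_URL):
    """Delete the given ids and return the segment summary before and after."""
    before = summarize_segments(fetch_segments(index, base_url))
    req = delete_by_query_request(index, {"ids": {"values": doc_ids}}, base_url)
    with urllib.request.urlopen(req) as resp:
        deleted = json.load(resp).get("deleted")
    after = summarize_segments(fetch_segments(index, base_url))
    return {"deleted": deleted, "before": before, "after": after}
```

Calling something like `measure_delete("my-index-000001", ["1", "2"])` against a test cluster (the index name and ids here are hypothetical) returns the two summaries side by side, so any growth in the segment count is visible directly.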
While investigating, it seems that deleting a document from an index is itself a kind of write operation, since it increments a version count. So a simple delete operation takes a long time and also creates new segments. (The version is later used to prevent writing the same document with a lower version, though our use case does not need such protection.)
What I am looking for is a way to delete a document from a rolled-over index without affecting the segment count, that is, to run a delete request without any extra footprint.
How can I do so?
Is there any other trick for deleting docs from rolled-over (force-merged) indices while keeping the segments (and ultimately performance) unchanged?
Does the current Elasticsearch (7.17) support such a feature?