My organization uses Elasticsearch to store and index information for one of our services. We have an internal compliance requirement to delete a user's data whenever that user requests it. To comply, we listen for data deletion events and, when a request comes in, call the delete API on all documents associated with the user. However, we recently learned that internally a deleted document is only marked as deleted and is still persisted in storage after the delete operation is called. The data is fully removed from the cluster only when an automatic merge operation runs. We have also noticed that the docs.deleted metric has been rising steadily in our cluster over the past year.
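For context, here is a minimal sketch of one way the docs.deleted figure can be read per index from the index stats API (`GET /<index>/_stats/docs`). The index name `user-data`, the sample response, and the helper names `deleted_docs_summary` and `fetch_stats` are hypothetical and only for illustration; the sketch also assumes an unauthenticated cluster, which will not match most real setups.

```python
import json
import urllib.request


def deleted_docs_summary(stats: dict) -> dict:
    """Return live and deleted doc counts per index from an _stats/docs response."""
    summary = {}
    for name, body in stats.get("indices", {}).items():
        docs = body["primaries"]["docs"]
        live, deleted = docs["count"], docs["deleted"]
        total = live + deleted
        summary[name] = {
            "live": live,
            "deleted": deleted,
            # Share of documents in the primaries that are marked deleted
            # but not yet merged away.
            "deleted_ratio": deleted / total if total else 0.0,
        }
    return summary


def fetch_stats(base_url: str, index: str) -> dict:
    """GET <base_url>/<index>/_stats/docs (no auth: an assumption of this sketch)."""
    with urllib.request.urlopen(f"{base_url}/{index}/_stats/docs") as resp:
        return json.load(resp)


# Hypothetical response, trimmed to the fields used above.
sample = {
    "indices": {
        "user-data": {"primaries": {"docs": {"count": 900, "deleted": 100}}}
    }
}
print(deleted_docs_summary(sample))
```

Against a live cluster, the same summary would come from something like `deleted_docs_summary(fetch_stats("http://localhost:9200", "user-data"))`, where the URL is again a placeholder.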
From our research, we don't think there is any way to recover or search for these "soft-deleted" documents through an API. However, to verify that we are in compliance with this legal requirement, we want to understand whether there is any feasible way for these "soft-deleted" documents to be retrieved or read. We are hoping someone can provide additional clarification here. Thanks!