Hi,
I would like to know if there is some official way of doing soft delete in ES?
Thank you
What do you mean by soft delete?
I mean not deleting the document directly from storage,
but marking it somehow so that it is no longer searchable,
similar to adding an is_deleted column in MySQL.
No, there's not.
If you want it to exist but not be available for search then you should do that during index time.
Could you please expand on how to do that at index time?
There are a few ways:
https://www.elastic.co/guide/en/elasticsearch/reference/6.3/enabled.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.3/mapping-index.html
Sorry, my question was confusing.
I mean how to hide a document instead of using the delete API.
For example, I have 5 docs in the index: [doc1, doc2, doc3, doc4, doc5].
I found that doc2 is invalid, so I don't want it to be searchable by our business logic. But to keep the data for audit purposes, I don't want to delete doc2 from ES either.
I am thinking of adding a field is_searchable to the doc, updating it to false for doc2, and adding a filter is_searchable = true when searching.
But with this method I have to update every other existing doc to set the field to true, and in my real usage I have millions of existing documents.
In this case, do you have any better ideas for doing that?
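One way the backfill might be avoided is to invert the flag: mark only the hidden documents with is_deleted = true and exclude them with a must_not clause. A term query does not match documents that lack the field, so existing documents without the flag would still be returned. The sketch below only builds the search request body in Python; the field name is_deleted and the title field are assumptions for illustration, and sending the body to the _search endpoint is left to whatever client is in use.

```python
import json


def exclude_soft_deleted(user_query, flag_field="is_deleted"):
    """Wrap a query so that documents flagged as deleted are excluded.

    Documents that do not have `flag_field` at all are not matched by the
    term query, so the must_not clause lets them through unchanged.
    """
    return {
        "query": {
            "bool": {
                "must": [user_query],
                # must_not clauses only filter; they do not contribute to the score
                "must_not": [{"term": {flag_field: True}}],
            }
        }
    }


# Hypothetical user query against an assumed "title" field
body = exclude_soft_deleted({"match": {"title": "invoice"}})
print(json.dumps(body, indent=2))
```

With this shape, only the documents being hidden need a partial update. The hidden documents remain in the index, so they still count toward term statistics used for scoring.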
Why not set the field is_searchable to `true` in all documents at index time and then just update the documents that should not be visible? You can then access the visible documents through a filtered index alias. Note that as the hidden documents are still in the index, they will affect relevancy scoring.
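A minimal sketch of the filtered alias idea follows. It builds two request bodies in Python: one for the aliases API (POST /_aliases) that adds an alias with a term filter, and one partial-update body for the Update API that flips the flag on a single document. The index name orders and alias name orders_visible are hypothetical, and the helper functions are illustrative rather than part of any client library.

```python
import json


def filtered_alias_actions(index, alias, flag_field="is_searchable"):
    """Body for POST /_aliases: an alias that only exposes flagged documents."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": alias,
                    # searches through the alias only see documents matching this filter
                    "filter": {"term": {flag_field: True}},
                }
            }
        ]
    }


def hide_document_body(flag_field="is_searchable"):
    """Partial-update body for the Update API that hides one document."""
    return {"doc": {flag_field: False}}


# "orders" and "orders_visible" are assumed names for illustration only
alias_body = filtered_alias_actions("orders", "orders_visible")
hide_body = hide_document_body()
print(json.dumps(alias_body, indent=2))
print(json.dumps(hide_body))
```

Application searches would then go to the alias, while audit tooling could query the underlying index directly and still see the hidden documents.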
You can also do this with document and field level security - https://www.elastic.co/guide/en/elastic-stack-overview/6.3/field-and-document-access-control.html
Just wondering, why do you want this?
No specific reason.
We usually don't do hard deletes on storage.
Thank you for your suggestion.
Yes, we will set is_searchable to true for new documents at index time.
The problem is that updating all 60 million existing documents is not ideal.
Actually, considering that the hidden docs will still affect scoring on search, I am now hesitant to do this.
So don't complicate things IMHO.