Any suggested way of doing soft delete in ES?


(Lan Qi) #1

Hi,

I would like to know whether there is an official way of doing a soft delete in ES.
Thank you


(Mark Walkom) #2

What do you mean by soft delete?


(Lan Qi) #3

Not deleting the document directly from storage,
but marking it somehow so that it is no longer searchable,
similar to adding an is_deleted column in MySQL.


(Mark Walkom) #4

No, there's not.
If you want it to exist but not be available for search, then you should handle that at index time.


(Lan Qi) #5

Could you please expand on how to do that at index time?


(Mark Walkom) #6

There are a few ways:
https://www.elastic.co/guide/en/elasticsearch/reference/6.3/enabled.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.3/mapping-index.html
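As a rough illustration of what those two mapping parameters look like in a create-index request body, here is a minimal Python sketch that only builds the JSON. The `_doc` type name and the field names `title`, `audit_payload` and `internal_note` are made up for the example. Note that both settings apply to a field across all documents, not to individual documents.

```python
import json


def build_mapping(doc_type="_doc"):
    """Build a 6.x-style create-index body (all field names are hypothetical)."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # Regular full-text field, indexed and searchable.
                    "title": {"type": "text"},
                    # "enabled": false keeps the object in _source but does not parse or index it.
                    "audit_payload": {"type": "object", "enabled": False},
                    # "index": false keeps the field in _source but makes it non-queryable.
                    "internal_note": {"type": "keyword", "index": False},
                }
            }
        }
    }


if __name__ == "__main__":
    # The printed JSON could be sent as the body of PUT /<index-name>.
    print(json.dumps(build_mapping(), indent=2))
```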


(Lan Qi) #7

Sorry, my question was confusing.
I mean how to hide one document instead of using the delete API.
For example, I have 5 docs in the index: [doc1, doc2, doc3, doc4, doc5].
I found that doc2 is invalid, so I don't want it to be searchable in the business logic. But to keep the data for audit purposes, I don't want to delete doc2 directly in ES either.

I am thinking of adding a field is_searchable to the doc, updating it to false for doc2, and adding a filter is_searchable = true when searching.
But with this method I have to set the field to true on all other existing docs, and in my real usage I have millions of existing documents.

In this case, do you have any better ideas for doing that?
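For reference, the filter idea described above can be sketched as a search body like the one below. This only builds the JSON in Python; `search_visible` and `search_not_hidden` are hypothetical helper names. The second helper is a variation not proposed in the thread: it excludes only documents explicitly flagged `false`, so documents that lack the field would still match, under the assumption that a missing flag should mean "searchable".

```python
import json

FLAG = "is_searchable"  # field name taken from the post


def search_visible(user_query):
    """Wrap a query so that only documents with is_searchable == true match."""
    return {
        "query": {
            "bool": {
                "must": [user_query],
                # Filter context: restricts the hits without contributing to the score.
                "filter": [{"term": {FLAG: True}}],
            }
        }
    }


def search_not_hidden(user_query):
    """Variation: drop only documents explicitly flagged false."""
    return {
        "query": {
            "bool": {
                "must": [user_query],
                # Documents without the field do not match this term query,
                # so they are not excluded.
                "must_not": [{"term": {FLAG: False}}],
            }
        }
    }


if __name__ == "__main__":
    q = {"match": {"title": "invoice"}}  # hypothetical user query
    print(json.dumps(search_visible(q), indent=2))
    print(json.dumps(search_not_hidden(q), indent=2))
```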


(Christian Dahlqvist) #8

Why not set the field is_searchable to `true` in all documents at index time and then just update the documents that should not be visible? You can then access the data through a filtered index alias. Note that as the documents are still in the index, they will affect relevancy scoring.
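A minimal sketch of that suggestion, assuming a hypothetical index `my_index` and alias `my_index_visible`: one body for a partial update that flips the flag on a single document, and one body for the `_aliases` API that creates a filtered alias. The Python only builds the JSON and does not talk to a cluster.

```python
import json

INDEX = "my_index"          # hypothetical index name
ALIAS = "my_index_visible"  # hypothetical alias name
FLAG = "is_searchable"      # field name from the thread


def hide_document_body():
    """Body for a partial update (the _update endpoint) that hides one document."""
    return {"doc": {FLAG: False}}


def filtered_alias_body():
    """Body for POST /_aliases: an alias that only exposes documents flagged true."""
    return {
        "actions": [
            {
                "add": {
                    "index": INDEX,
                    "alias": ALIAS,
                    "filter": {"term": {FLAG: True}},
                }
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(hide_document_body()))
    print(json.dumps(filtered_alias_body(), indent=2))
```

With this layout, business searches would go to the alias, while audit reads would query the underlying index directly and still see the hidden documents.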


(Mark Walkom) #9

You can also do this with document and field level security - https://www.elastic.co/guide/en/elastic-stack-overview/6.3/field-and-document-access-control.html
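A rough sketch of what a role with document level security might look like, assuming the security features are available on the cluster. The index name `my_index` and the idea of reusing the `is_searchable` flag in the role query are assumptions for illustration; the linked documentation has the exact API details.

```python
import json


def visible_docs_role_body(index="my_index", flag="is_searchable"):
    """Body for a role definition whose read access is limited by a query.

    Users holding only this role would see just the documents matching the query.
    """
    return {
        "indices": [
            {
                "names": [index],
                "privileges": ["read"],
                # Document level security: only documents matching this query are visible.
                "query": {"term": {flag: True}},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(visible_docs_role_body(), indent=2))
```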


(David Pilato) #10

Just wondering, why do you want this?


(Lan Qi) #11

No specific reason.
We usually don't do hard deletes on storage.


(Lan Qi) #12

Thank you for your suggestion.

Yes, we will set is_searchable to true for new documents at index time.

The problem is that updating all 60 million existing documents is not ideal.

Actually, considering that the hidden docs will still affect scoring on search, I am hesitant about doing this now.


(David Pilato) #13

So don't complicate things IMHO.

