Delete efficiently


(Ophir Michaeli) #1

Hi all,

I want to delete 1 million documents at a time from a given list of
documents I receive. The delete is by query (on one of the document's fields).
I want to understand the best practice for doing that.
Should I loop through the million and delete them one by one asynchronously,
or would that overload Elasticsearch, so that I should instead delete X at a
time asynchronously, wait until that is done, and then delete the next X?
Or is there a batch delete that does this work more efficiently?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4ef1b8d7-9682-4315-af47-ba1d21f535e2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(David Pilato) #2

If only a small subset of the documents will remain at the end, you could also consider reindexing just those docs into a new index and then removing the old index.

To answer your question: run your query using scan and scroll, and remove the matching docs using bulk requests.
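
As a rough illustration of the bulk half of that advice, here is a minimal sketch in Python. It assumes the hits coming back from the scan/scroll pages are dicts carrying `_index`, `_id` and (on older versions) `_type`. The helper names `delete_actions` and `bulk_bodies` and the batch size of 1000 are made up for the example and are not part of any client library.

```python
import json
from itertools import islice


def delete_actions(hits):
    """Yield one bulk 'delete' action line per search hit."""
    for hit in hits:
        meta = {"_index": hit["_index"], "_id": hit["_id"]}
        # Older Elasticsearch versions also expect the mapping type.
        if "_type" in hit:
            meta["_type"] = hit["_type"]
        yield json.dumps({"delete": meta})


def bulk_bodies(hits, batch_size=1000):
    """Group delete actions into newline-delimited bulk request bodies."""
    actions = delete_actions(hits)
    while True:
        batch = list(islice(actions, batch_size))
        if not batch:
            return
        # The bulk API expects one JSON object per line and a trailing newline.
        yield "\n".join(batch) + "\n"
```

Each yielded body could then be sent to the `_bulk` endpoint, waiting for the response before sending the next one, so the cluster only ever handles one batch of X deletes at a time. The right batch size depends on document and cluster size and is worth measuring.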

My 2 cents

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

In reply to the message from Ophir Michaeli (ophirmichaeli@gmail.com) of 22 Jul 2014, 10:40.


