Elasticsearch Bulk API (Delete) vs DeleteByQuery

My requirement is to delete a particular document (by doc_id) from multiple indices in one go. The catch is that I do not know which indices contain this document, so I have to send a delete request for the doc_id to every index in the system.

When I tried to delete via the bulk API by passing the index and doc ID, I got a 'request too big' error (status code: 429). The only alternative I could think of is to send a single DeleteByQuery request that targets all the indices to be considered.
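For illustration, here is a minimal sketch of what such a single DeleteByQuery request might look like. It only builds the request path and body and does not talk to a cluster. The function name `build_delete_by_query`, the index names, and the doc ID are hypothetical; the body uses the `ids` query to match on the document `_id`.

```python
import json

def build_delete_by_query(indices, doc_id):
    """Return (path, body) for a _delete_by_query call matching one document ID."""
    # A comma-separated index list targets several indices in one request.
    path = "/" + ",".join(indices) + "/_delete_by_query"
    # The ids query matches documents by their _id.
    body = {"query": {"ids": {"values": [doc_id]}}}
    return path, json.dumps(body)

# Hypothetical index names and doc ID, for illustration only.
path, body = build_delete_by_query(["index-a", "index-b"], "doc-42")
print("POST", path)
print(body)
```

With several hundred indices the comma-separated list gets long, so a wildcard pattern or an alias covering the indices may be a more practical target than listing each name.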

My question is whether DeleteByQuery is costlier than deleting by doc ID and index when it touches every index present.

Is there any alternative approach I could give a shot?

Cluster Info -

  • 2 data nodes.
  • 318 indices.
  • 1 shard per node.
  • 1 lakh (100,000) documents in total (not per client).

Thanks.

Why do you have so many indices for such a small number of documents?

Sorry, I made a mistake. In the worst-case scenario I will have 1 lakh records per index.

Assuming that means 100,000 documents per index, it is still not necessarily a lot of data for an index unless the documents are massive. I would generally aim for a shard size of 10GB to 50GB. What is your average shard size?

The shard size is around 1 GB. The current index design was required for a particular use case, and all optimisations are a work in progress.
The main issue I am facing is 'Request too Big' (status code 429). I was looking into ways to either reduce the number of delete actions sent in each bulk request, or send a DeleteByQuery instead, with the query pointing to the doc ID.

Which one would be better?
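As a rough sketch of the first option, the bulk deletes could be split into smaller batches so that no single request carries all 318 actions. The code below only builds the newline-delimited JSON bodies for `POST /_bulk` and does not send anything. The function name `bulk_delete_batches`, the batch size of 50, and the index names are assumptions for illustration, not tuned recommendations.

```python
import json

def bulk_delete_batches(indices, doc_id, batch_size=50):
    """Yield NDJSON bodies for POST /_bulk, each with at most batch_size delete actions."""
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        lines = [json.dumps({"delete": {"_index": name, "_id": doc_id}}) for name in batch]
        # The bulk API expects newline-delimited JSON that ends with a newline.
        yield "\n".join(lines) + "\n"

# Hypothetical index names; 318 matches the cluster described above.
indices = [f"index-{i}" for i in range(318)]
batches = list(bulk_delete_batches(indices, "doc-42", batch_size=50))
print(len(batches))  # 7 batches: six with 50 actions and one with 18
```

Whether smaller batches avoid the 429 depends on what is actually rejecting the request, so the batch size would need to be adjusted against the real cluster.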
