Increasing scroll size in delete by query


(Kunal Kapoor) #1

Hi,
According to the documentation, the delete-by-query plugin is built on the scroll and bulk APIs. The number of hits returned by each scroll request is 10 by default.
Can this size be increased? With the default setting, deleting the documents takes too long.
If not, is there a faster way to delete the documents?

I am using the following code:
new DeleteByQueryRequestBuilder(esClient, DeleteByQueryAction.INSTANCE)
    .setIndices(index)
    .setQuery(deleteQuery)
    .get()

Thanks


(Nik Everett) #2

Pre-5.0 the parameter is named size, which is confusing. It is documented here. 10 is a very, very small default size. In the transport client you'd do something like DeleteByQueryAction.newRequestBuilder(client).setSource(new SearchSourceBuilder().size(1000)).setIndices(index).setQuery(deleteQuery).get(). It is a bit funky to do, sorry!
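To put rough numbers on why a size of 10 hurts, here is a minimal sketch of the arithmetic. It assumes, as a simplification, that every scroll round trip returns at most `size` hits in total and that each batch is followed by one bulk delete. The class name `ScrollBatches`, the method `roundTrips`, and the document count are made up for illustration and are not part of any Elasticsearch API.

```java
public class ScrollBatches {

    /**
     * Number of scroll round trips needed to page through totalHits documents
     * when each round trip returns at most batchSize hits (ceiling division).
     */
    static long roundTrips(long totalHits, int batchSize) {
        if (totalHits < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("totalHits must be >= 0 and batchSize > 0");
        }
        return (totalHits + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        long docs = 110_000; // illustrative document count, not a measured value

        // Print how many round trips each candidate size would need.
        for (int size : new int[] {10, 1_000, 10_000}) {
            System.out.println("size=" + size + " -> " + roundTrips(docs, size) + " round trips");
        }
    }
}
```

Under those assumptions, raising the size from 10 to 10000 cuts the number of scroll and bulk round trips by a factor of 1000, which is where most of the speedup would come from.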

We rewrote delete-by-query to use the same infrastructure as reindex and update by query in 5.0. The parameter is named scroll_size. In the transport client it is fairly similar to 2.x.
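For the REST side in 5.0, the request would look roughly like what this sketch builds. It only assembles the path and body as strings and does not send anything. The class `DeleteByQueryPath`, its methods, the index name `my-index`, and the `match_all` query are placeholders chosen for illustration; only the `_delete_by_query` endpoint and the `scroll_size` parameter come from Elasticsearch itself.

```java
public class DeleteByQueryPath {

    /**
     * Builds the request path for a 5.x delete-by-query call with an explicit
     * scroll batch size. The index name is assumed to be URL-safe already.
     */
    static String path(String index, int scrollSize) {
        if (scrollSize <= 0) {
            throw new IllegalArgumentException("scrollSize must be > 0");
        }
        return "/" + index + "/_delete_by_query?scroll_size=" + scrollSize;
    }

    /** Wraps a query (already serialized as JSON) in the request body. */
    static String body(String queryJson) {
        return "{\"query\":" + queryJson + "}";
    }

    public static void main(String[] args) {
        // Placeholder index and query, for illustration only.
        System.out.println("POST " + path("my-index", 1000));
        System.out.println(body("{\"match_all\":{}}"));
    }
}
```

The printed path and body can then be sent with curl or any HTTP client against the cluster.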


(Kunal Kapoor) #3

I tried the solution, but the setSource method does not take a SearchSourceBuilder as an argument. I had to call toString on the SearchSourceBuilder to get it to work.
But my problem still persists.
I ran the delete query on 110k documents with a size of 50k via curl, and it deleted only 50k documents. It reported 60k as failed.
The next time I ran the query, it reported that 60k documents were found but that deletion failed.
I cannot understand why it was able to delete documents the first time but not the second.


(Kunal Kapoor) #4

@nik9000
This did the trick for me:
DeleteByQueryAction.INSTANCE.newRequestBuilder(client)
    .setSource(new SearchSourceBuilder().query(filters).size(10000).toString)
    .setIndices(queryIndex)
    .get()

The deletion now completes within a couple of seconds.

Thanks for your help


(Nik Everett) #5

Yikes! I hadn't realized it didn't have an overload that accepts the actual source builder.

If this works then great. This whole thing is rewritten in 5.0 and should be much easier to work with.

