Delete by query, refresh=wait_for support?

Grievoushead · October 26, 2018, 7:05am

When I call delete_by_query with ?refresh=wait_for, I get this warning:

_delete_by_query/?refresh=wait_for] returned 1 warnings: [299 Elasticsearch-5.4.1-2cfe0df "Expected a boolean [true/false] for request parameter [refresh] but got [wait_for]

Is it supported? The docs say it is.

And when I call delete_by_query with ?retry_on_conflict=3, it fails: invalid parameter.

I wanted to use these because when I delete records, they get rejected because of version conflicts. I read about ?conflicts=proceed, but I am not sure whether it will leave some of the records undeleted. That's why I wanted to add the retry first and then wait_for.

xavierfacq · October 26, 2018, 7:42am

Hi,

What version are you running?

Doc : https://www.elastic.co/guide/en/elasticsearch/reference/6.4/docs-delete-by-query.html

Grievoushead · October 26, 2018, 8:47am

Elasticsearch-5.4.1

Here it says that delete supports the refresh param:

Does that mean ?refresh=wait_for is supported in update but not in delete?

And why doesn't delete support ?retry_on_conflict=3?

I see that when it fails with a version conflict, the response has:

retries: {bulk: 0, search: 0}, meaning it failed on the first try, but the docs from your link say it should retry up to 10 times.

xavierfacq · October 26, 2018, 9:30am

I think that there is a misunderstanding with the doc:

https://www.elastic.co/guide/en/elasticsearch/reference/5.4/docs-delete-by-query.html

In addition to the standard parameters like `pretty`, the Delete By Query API also supports `refresh`, `wait_for_completion`, `wait_for_active_shards`, and `timeout`.

As I read it, the first three parameters are boolean: &refresh=true&wait_for_completion=true...
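To illustrate the boolean reading of `refresh`, here is a minimal Python sketch that builds the request URL and rejects non-boolean values before they reach the cluster. The helper name `delete_by_query_url`, the host, and the index name are hypothetical, and the check simply mirrors the "Expected a boolean [true/false]" warning quoted at the top of the thread; it is not how any official client does it.

```python
from urllib.parse import urlencode


def delete_by_query_url(base_url, index, refresh=False, wait_for_completion=True):
    """Build a _delete_by_query URL, allowing only true/false for refresh.

    Hypothetical helper: on this endpoint refresh=wait_for triggers the
    "Expected a boolean" warning, so it is rejected here on the client side.
    """
    if not isinstance(refresh, bool):
        raise ValueError("refresh must be True or False for _delete_by_query")
    if not isinstance(wait_for_completion, bool):
        raise ValueError("wait_for_completion must be True or False")
    params = {
        # Elasticsearch expects lowercase "true"/"false" in the query string.
        "refresh": str(refresh).lower(),
        "wait_for_completion": str(wait_for_completion).lower(),
    }
    return f"{base_url.rstrip('/')}/{index}/_delete_by_query?{urlencode(params)}"
```

For example, `delete_by_query_url("http://localhost:9200", "my-index", refresh=True)` yields a URL ending in `?refresh=true&wait_for_completion=true`, while passing `refresh="wait_for"` raises `ValueError`.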

Grievoushead · October 26, 2018, 11:28am

I see. It looks strange, because this part of the docs says ?refresh has 3 values (true, false, wait_for), while delete has a specific param, ?wait_for_completion=true, which I believe is the same as ?refresh=wait_for.

Anyhow, I have another question: the ?retry_on_conflict=3 param doesn't work for delete_by_query but works for update.

Also, it's not clear why it doesn't retry by default, as the docs for delete_by_query say it does:

_delete_by_query relies on a default policy to retry rejected requests (up to 10 times, with exponential back off)

--Thanks.

xavierfacq · October 26, 2018, 12:10pm

IMO retry_on_conflict is only used when updating a document; in that case, a version conflict may occur. When you delete a doc, there is no conflict "possible": the doc is deleted.

In the response, "retries" counts the retries made when the initial request failed (cluster error, timeout, etc.); this is not a version conflict.
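To make the difference between the two counters concrete, here is a minimal Python sketch that reads a delete-by-query response body. It assumes the response carries `deleted`, `version_conflicts`, `failures`, and the `retries: {bulk, search}` object shown earlier in the thread; the function name `summarize_delete_response` is hypothetical.

```python
def summarize_delete_response(resp):
    """Separate retried requests from version conflicts in a response dict.

    Hypothetical helper: `resp` is the decoded JSON body of a
    _delete_by_query call.
    """
    retries = resp.get("retries", {})
    conflicts = resp.get("version_conflicts", 0)
    return {
        "deleted": resp.get("deleted", 0),
        # Requests that were re-sent, not documents that conflicted.
        "retried_requests": retries.get("bulk", 0) + retries.get("search", 0),
        # Documents whose version changed between the search and the delete.
        "version_conflicts": conflicts,
        # True when some matching documents may still be in the index.
        "incomplete": conflicts > 0 or bool(resp.get("failures")),
    }
```

A response with `retries: {bulk: 0, search: 0}` and a non-zero `version_conflicts` would therefore be reported as incomplete even though nothing was retried.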

bye,
Xavier

nik9000 · October 26, 2018, 2:10pm

I can answer this because most of it is my fault!

retry_on_conflict isn't supported by delete-by-query and update-by-query because we don't have a mechanism to re-check that the document still matches the query after the conflict. It may not. Your only option is to ignore conflicts (conflicts=proceed) and redo the entire request if there are any conflicts. Delete-by-query at least will be much smaller the second time around.

Not supporting wait_for is my fault on both counts. I made wait_for and I made the reindex infrastructure that powers delete-by-query, and I didn't link them because it is hard. wait_for hooks pretty deep into the shard to know when a refresh occurs. But delete-by-query functions at a much higher level. It could use refresh=wait_for on every bulk request that it sends, but that'd slow down every bulk request while it waits for a refresh for each one. Delete-by-query can't hook into the wait_for infrastructure in any other way. So it doesn't. But the docs don't reflect that. I'll fix that.
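The "ignore conflicts and redo the entire request" approach could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names, host, index, and query are hypothetical, the request is sent with the standard library rather than an official client, and the loop relies on the `deleted` and `version_conflicts` fields of the response.

```python
import json
import urllib.request


def run_delete_by_query(base_url, index, query):
    """Send one _delete_by_query with conflicts=proceed and refresh=true."""
    url = (f"{base_url.rstrip('/')}/{index}/_delete_by_query"
           "?conflicts=proceed&refresh=true")
    body = json.dumps({"query": query}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def delete_until_no_conflicts(send, max_attempts=5):
    """Repeat the whole request until a pass reports zero version conflicts.

    `send` is any zero-argument callable returning the response dict.
    Returns (total_deleted, attempts_used).
    """
    total_deleted = 0
    for attempt in range(1, max_attempts + 1):
        result = send()
        total_deleted += result.get("deleted", 0)
        if result.get("version_conflicts", 0) == 0:
            return total_deleted, attempt
    raise RuntimeError(f"version conflicts remain after {max_attempts} attempts")
```

It might be called as `delete_until_no_conflicts(lambda: run_delete_by_query("http://localhost:9200", "my-index", {"term": {"status": "stale"}}))`. Each pass re-runs the query, so documents that no longer match after a conflict are simply not selected again.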

Grievoushead · October 30, 2018, 11:32am

Hi Nik,
Thank you for your answer.

But it is still not clear to me what I should do when a version conflict arises during delete_by_query.

What I did was add _delete_by_query/?refresh=wait_for. It prints warnings, but the version conflicts don't occur anymore, so does it actually work?

Or does it interpret any non-empty string as refresh=true?

What happens if I use the conflicts=proceed param? Will it skip the conflicting deletes, so that I will need to retry manually?

Thank you.

Terry_Quigley · November 7, 2018, 12:06pm

Hi

It isn't clear to me either, as I've been having issues with ?refresh=wait_for not doing what I expected from the docs. I have also tried ?refresh=true with similar behaviour (i.e. it appears to do nothing). Is there something else I should try?

Thanks

system · December 5, 2018, 12:06pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
