Delete by query: is refresh=wait_for supported?

When I call delete_by_query with ?refresh=wait_for, I get this warning:

_delete_by_query/?refresh=wait_for] returned 1 warnings: [299 Elasticsearch-5.4.1-2cfe0df "Expected a boolean [true/false] for request parameter [refresh] but got [wait_for]

Is it supported? The docs say it is.

And when I call delete_by_query with ?retry_on_conflict=3,
it fails with an invalid parameter error.

I wanted to use these because when I delete records, some deletes are rejected because of version conflicts. I read about ?conflicts=proceed, but I am not sure whether it will leave some of the matching records undeleted. That's why I wanted to add the retry first and then wait_for.

Hi,

What version are you running?

Doc : https://www.elastic.co/guide/en/elasticsearch/reference/6.4/docs-delete-by-query.html

Elasticsearch-5.4.1

Here it says that delete supports the refresh param:

Does it mean that ?refresh=wait_for is supported in update but not in delete?

And why doesn't delete support ?retry_on_conflict=3?

I see that when it fails with a version conflict, the response has:

retries: {bulk: 0, search: 0}, meaning it failed on the first try, but the docs from your link say it should retry up to 10 times.

I think that there is a misunderstanding with the doc:

https://www.elastic.co/guide/en/elasticsearch/reference/5.4/docs-delete-by-query.html

In addition to the standard parameters like `pretty`, the Delete By Query API also supports `refresh`,
`wait_for_completion`, `wait_for_active_shards`, and `timeout`.

As I read it, the first three parameters are booleans: &refresh=true&wait_for_completion=true...

I see. It looks strange, because in this part the docs say ?refresh has 3 values (true, false, wait_for).
And delete has a specific param, ?wait_for_completion=true, which I believe is the same as
?refresh=wait_for.

Anyhow, I have another question: the ?retry_on_conflict=3 param doesn't work for delete_by_query but works for update.

Also, it's not clear why it doesn't retry by default, as the docs for delete_by_query say:

_delete_by_query relies on a default policy to retry rejected requests (up to 10 times, with exponential back off)

--Thanks.

IMO retry_on_conflict is only used when updating a document; in that case, a version conflict may occur. When you delete a doc, there is no conflict "possible": the doc is deleted.

In the response, "retries" counts the times the initial request failed (cluster error, timeout, etc.); this is not a version conflict.
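To make the distinction concrete, here is a minimal Python sketch that reads the two counters separately. It assumes the response body carries the `retries` and `version_conflicts` fields of the delete-by-query response; the `summarize` helper and the sample dictionary are hypothetical, made up for illustration.

```python
from typing import Any, Dict


def summarize(resp: Dict[str, Any]) -> Dict[str, int]:
    """Pull the retry and conflict counters out of a delete-by-query response."""
    retries = resp.get("retries", {})
    return {
        "deleted": resp.get("deleted", 0),
        "version_conflicts": resp.get("version_conflicts", 0),
        "bulk_retries": retries.get("bulk", 0),
        "search_retries": retries.get("search", 0),
    }


# Hypothetical response: two version conflicts, yet nothing was retried.
sample = {"deleted": 8, "version_conflicts": 2, "retries": {"bulk": 0, "search": 0}}
summary = summarize(sample)
print(summary)
```

A non-zero `version_conflicts` next to zero retries is consistent with what was reported earlier in the thread.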

bye,
Xavier

I can answer this because most of it is my fault!

retry_on_conflict isn't supported by delete-by-query and update-by-query because we don't have a mechanism to re-check that the document still matches the query after the conflict. It may not. Your only option is to ignore conflicts (conflicts=proceed) and redo the entire request if there are any conflicts. Delete-by-query at least will be much smaller the second time around.
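A rough Python sketch of that "ignore conflicts and redo the whole request" loop could look like the following. It is not part of any client library: `delete_until_clean` and `run_delete` are hypothetical names, and `run_delete` is assumed to send one delete-by-query with conflicts=proceed and return the parsed response with its `deleted` and `version_conflicts` counters. The canned responses only stand in for a real cluster.

```python
from typing import Any, Callable, Dict


def delete_until_clean(
    run_delete: Callable[[], Dict[str, Any]], max_rounds: int = 5
) -> Dict[str, int]:
    """Re-issue the whole delete-by-query until a round reports no conflicts."""
    total_deleted = 0
    conflicts = 0
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        resp = run_delete()  # assumed: one request sent with conflicts=proceed
        total_deleted += resp.get("deleted", 0)
        conflicts = resp.get("version_conflicts", 0)
        if conflicts == 0:
            break  # nothing was skipped in this round, so we are done
    return {"rounds": rounds, "deleted": total_deleted, "version_conflicts": conflicts}


# Canned responses in place of real HTTP calls (hypothetical numbers).
responses = iter([
    {"deleted": 97, "version_conflicts": 3},
    {"deleted": 3, "version_conflicts": 0},
])
result = delete_until_clean(lambda: next(responses))
print(result)
```

The round limit is a design choice: if conflicts keep appearing because other writers are busy on the same documents, the loop gives up and reports the remaining count instead of spinning forever.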

Not supporting wait_for is my fault on both counts. I made wait_for and I made the reindex infrastructure that powers delete-by-query, and I didn't link them because it is hard. wait_for hooks pretty deep into the shard to know when a refresh occurs. But delete-by-query functions at a much higher level. It could use refresh=wait_for on every bulk request that it sends, but that'd slow down every bulk request while it waits for a refresh for each one. Delete-by-query can't hook into the wait_for infrastructure in any other way. So it doesn't. But the docs don't reflect that. I'll fix that.
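Given that, a client can refuse wait_for up front instead of sending it and getting the warning back. The sketch below only builds the request URL and sends nothing; `delete_by_query_url` is a hypothetical helper, and the base URL and index name are placeholders.

```python
from urllib.parse import urlencode


def delete_by_query_url(
    base_url: str, index: str, refresh: bool = False, conflicts: str = "abort"
) -> str:
    """Build a _delete_by_query URL, allowing only a boolean refresh."""
    if not isinstance(refresh, bool):
        # wait_for is valid on single-document and bulk writes, not here.
        raise ValueError("refresh must be True or False for delete-by-query")
    if conflicts not in ("abort", "proceed"):
        raise ValueError("conflicts must be 'abort' or 'proceed'")
    params = {"refresh": "true" if refresh else "false", "conflicts": conflicts}
    return f"{base_url.rstrip('/')}/{index}/_delete_by_query?{urlencode(params)}"


url = delete_by_query_url(
    "http://localhost:9200", "my-index", refresh=True, conflicts="proceed"
)
print(url)
```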

Hi Nik,
Thank you for your answer.

But it is still not clear to me: what should I do when a version conflict arises during delete_by_query?

What I did is add _delete_by_query/?refresh=wait_for. It prints warnings, but version conflicts don't occur anymore, so does it actually work?

Or does it interpret any non-empty string as refresh=true?

What happens if I use the conflicts=proceed param: will it skip deleting the conflicting documents, so that I need to retry manually?

Thank you.

Hi

It isn't clear to me either, as I've been having issues with ?refresh=wait_for not doing what I expected from the docs. I have also tried ?refresh=true with similar behaviour (i.e. it appears to do nothing). Is there something else I should try?

Thanks