UpdateByQuery script version conflict

Hi,
I'm getting a lot of version conflicts when running an update by query with a Painless script, even though the documents are not being updated by anything else at the same time (my script is the only writer on the cluster at that time).
Most of the documents that should be updated by this script are indexed a few minutes earlier by the same process (outside the cluster of course, in PHP), so I tried calling the refresh API before executing the update by query, to be sure they are available for search. I think the refresh API returns without waiting for the refresh to finish, so I even tried adding a sleep (2 min) after the refresh to increase the chances of the refresh being finished.
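
To make the sequence concrete, here is a minimal sketch of the two calls in the order I issue them, written in Python rather than my actual PHP and only building the requests without sending them. The index name `my_index`, the `batch_id` and `status` fields, and the helper names are placeholders for illustration, not my real mapping or code.

```python
import json

INDEX = "my_index"  # hypothetical index name, not from the real cluster


def build_refresh_request(index):
    """Return (method, path) for a refresh API call on one index."""
    return ("POST", f"/{index}/_refresh")


def build_update_by_query_request(index, query, script_source, params):
    """Return (method, path, body) for an update by query with a Painless script."""
    body = {
        "query": query,
        "script": {
            "lang": "painless",
            "source": script_source,
            "params": params,
        },
    }
    return ("POST", f"/{index}/_update_by_query", body)


refresh_request = build_refresh_request(INDEX)
update_request = build_update_by_query_request(
    INDEX,
    query={"term": {"batch_id": "batch-1"}},  # hypothetical field and value
    script_source="ctx._source.status = params.status",  # hypothetical field
    params={"status": "processed"},
)

# Print the calls in the order described: refresh first, then update by query.
print(refresh_request[0], refresh_request[1])
print(update_request[0], update_request[1])
print(json.dumps(update_request[2], indent=2))
```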

Anyway, almost every time it runs on a large amount of data, there are version conflicts. This makes no sense, as the refresh seems to be finished and nothing else is indexing at the same time.
My script updates a lot of fields and adds a nested entry to documents that are children of other documents (the parents are not updated by this query). I don't know if this is relevant, but I thought it might be linked in some way...
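
For reference, the script has roughly the following shape. This is a simplified sketch, not the real script: `status` and `events` are hypothetical field names, and the real one touches many more fields. It is shown as the `script` object of the request, built in Python for readability.

```python
import json

# Painless source: set a scalar field, then append one entry to a nested array.
# `status` and `events` are hypothetical field names used for illustration.
SCRIPT_SOURCE = """
ctx._source.status = params.status;
if (ctx._source.events == null) {
  ctx._source.events = [];
}
ctx._source.events.add(params.event);
""".strip()


def build_script(status, event):
    """Return the `script` object for an update by query request."""
    return {
        "lang": "painless",
        "source": SCRIPT_SOURCE,
        "params": {"status": status, "event": event},
    }


script = build_script(
    "processed",
    {"type": "reprocessed", "at": "2020-01-01T00:00:00Z"},  # hypothetical entry
)
print(json.dumps(script, indent=2))
```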

There is probably something I'm missing in how update by query works, so if anyone has a clue, it would be really appreciated.

Thanks.

I tried proceeding on conflicts and retrying only the documents that were not updated.
For that, I set a date field to the current datetime (passed as a parameter) in the Painless script, so that all successfully updated documents would carry this date.
In case of version conflicts, I execute updateByQuery again after adding a must_not term filter on the previously passed datetime (so that only non-updated documents are updated).
Surprise: no documents are updated on the second launch, which logically means that they were all updated on the first launch, despite the version_conflicts...
I ran a manual query for documents stamped with the first date, and it turns out that hits.total equals the number of documents reported as updated on the first launch plus the version conflicts count from that launch.
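
The retry logic above can be sketched as follows. Again this is Python that only builds the requests, and the names are assumptions: `my_index`, the `last_script_run` date field, and the helper functions are placeholders, and the real script does more than stamp the date. The last helper expresses the count relationship I observed with the manual query.

```python
INDEX = "my_index"            # hypothetical index name
RUN_FIELD = "last_script_run"  # hypothetical date field stamped by the script


def first_pass(base_query, run_datetime):
    """First launch: proceed on conflicts and stamp each updated document."""
    path = f"/{INDEX}/_update_by_query?conflicts=proceed"
    body = {
        "query": base_query,
        "script": {
            "lang": "painless",
            "source": f"ctx._source.{RUN_FIELD} = params.run_datetime",
            "params": {"run_datetime": run_datetime},
        },
    }
    return path, body


def retry_pass(base_query, run_datetime):
    """Second launch: same update, excluding documents already stamped."""
    path, body = first_pass(base_query, run_datetime)
    body["query"] = {
        "bool": {
            "must": [base_query],
            "must_not": [{"term": {RUN_FIELD: run_datetime}}],
        }
    }
    return path, body


def stamped_count_matches(hits_total, updated, version_conflicts):
    """True when the stamped documents equal updated plus conflicted ones."""
    return hits_total == updated + version_conflicts
```

The `updated` and `version_conflicts` arguments correspond to the counters of the same names in the update by query response, and `hits_total` to `hits.total` from the manual search on the stamped date.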

How is that possible? If documents were in conflict the first time, how can they have been updated anyway (and counted not as updated but only as version conflicts)?
This version conflict issue seems to appear only with a large number of documents (most of the time, I try to update around 300 000 documents with one updateByQuery), so maybe the volume is triggering a bug?
