Hi all,
A quick question about bulk requests. We have some code that we suspect occasionally includes an index and a delete operation for the same document in the same bulk request. The order of the operations dictates what we expect to happen: a delete after an index should result in a deletion, and an index after a delete should result in the document being indexed.
I’m wondering how ES is supposed to behave when that happens, because I can’t find documentation that specifies it. Experimenting locally with a script and clusters of 1-3 nodes, I can see that the ordering does seem to matter: a bulk update that contains [delete doc n, update doc n] leaves the doc in the index, whereas [update doc n, delete doc n] removes it. I can run that in a tight loop and it seems deterministic.
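For anyone who wants to reproduce this, here is a minimal sketch of the kind of test I mean. It is not the exact script: it assumes an unsecured local node at `http://localhost:9200`, a hypothetical index name `bulk-order-test`, and uses `index` as the write operation. The helper names are made up for illustration.

```python
import json
import urllib.error
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no auth
INDEX = "bulk-order-test"         # hypothetical index name


def bulk_body(actions):
    """Turn (action, doc_id, source) tuples into an NDJSON _bulk body."""
    lines = []
    for action, doc_id, source in actions:
        lines.append(json.dumps({action: {"_index": INDEX, "_id": doc_id}}))
        if source is not None:  # "delete" has no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline


def send_bulk(body):
    """POST the body to _bulk and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{ES_URL}/_bulk?refresh=true",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def doc_exists(doc_id):
    """HEAD the document; 200 means present, 404 means absent."""
    req = urllib.request.Request(f"{ES_URL}/{INDEX}/_doc/{doc_id}", method="HEAD")
    try:
        with urllib.request.urlopen(req):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def run_trial(doc_id):
    """Run both orderings for one id and return (after_delete_index, after_index_delete)."""
    send_bulk(bulk_body([("delete", doc_id, None), ("index", doc_id, {"n": 1})]))
    first = doc_exists(doc_id)
    send_bulk(bulk_body([("index", doc_id, {"n": 1}), ("delete", doc_id, None)]))
    second = doc_exists(doc_id)
    return first, second
```

Calling `run_trial("1")` in a loop against a running cluster is what I mean by the tight loop above. If the ordering within the request is respected, I would expect `(True, False)` every time, but that expectation is exactly what I am asking about.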
But is it? There’s nothing that I can see in the documentation for bulk requests that defines this behaviour (we’re running an older version of Elasticsearch, 7.8, but IIRC the docs for later versions are similarly unclear). And, at a glance (I am absolutely not familiar with the ES codebase, but very willing to spelunk!), there’s nothing in the bulk request handler code that suggests any validation is done on the request, which might leave the order of operations up to the network or some other race condition.
Does anyone know where I might find a categorical answer to this question?
Thanks for your help!