Hi, I have a process that performs the following steps:
- retrieve lock
- delete by query
- refresh
- insert documents
- release lock
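
For reference, the steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: `replace_documents` is a hypothetical helper, `client` is assumed to behave like the official Python Elasticsearch client (8.x keyword-argument style), `lock` is assumed to be usable as a context manager, and the term query assumes the property is indexed as an exact-match (keyword) field.

```python
def replace_documents(client, lock, index, field, value, docs):
    """Delete all documents where `field` equals `value`, then index `docs`.

    `docs` maps document ID -> document body. All names here are
    illustrative assumptions, not the actual production code.
    """
    with lock:  # retrieve lock ... release lock
        # delete by query: remove everything inserted by the previous run
        client.delete_by_query(index=index, query={"term": {field: value}})
        # refresh: make the deletions visible to subsequent searches
        client.indices.refresh(index=index)
        # insert documents: indexing by ID creates or overwrites
        for doc_id, body in docs.items():
            client.index(index=index, id=doc_id, document=body)
```

Note that in this ordering the refresh comes before the insert, so the newly inserted documents are not explicitly refreshed before the lock is released.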
These steps can run several times in quick succession. The delete by query is supposed to find and delete all the documents inserted by the previous run. For the most part, this has worked fine. However, I recently discovered an incident where, after these steps completed, the index contained more documents than expected. I haven't been able to reproduce it, but I came up with the following hypothesis.
Consider the following example, with the refresh interval set to 1s:
Iteration 1:
- retrieve lock
- delete by query (nothing to delete)
- refresh
- insert documents with IDs 1, 2, 3, all with a property X with value of Y
- release lock
Immediately afterwards (less than 1s later), Iteration 2 runs:
- retrieve lock
- delete by query for all documents with property X with value of Y
- refresh
- insert documents with IDs 2, 3, 4, all with a property X with value of Y
- release lock
I expected the final state to contain only documents with IDs 2, 3, 4, but I ended up with documents with IDs 1, 2, 3, 4.
Of course there may be other issues with my code, but my suspicion is this: no explicit refresh was performed between the insert in Iteration 1 and the delete by query in Iteration 2, and no automatic refresh happened because the 1s refresh interval had not yet elapsed. As a result, the delete by query in Iteration 2 could not find the documents with IDs 1, 2 and 3, and deleted nothing. Then, when the insert phase of Iteration 2 ran, IDs 1, 2 and 3 still existed, so IDs 2 and 3 were updated and ID 4 was inserted, leaving ID 1 behind.
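
To make the hypothesis concrete, here is a toy Python model of it. It is not Elasticsearch code: `ToyIndex` and `run_iteration` are hypothetical names, and the model only assumes that writes become searchable after a refresh while indexing by ID always creates or overwrites. It deliberately ignores version conflicts and automatic refreshes.

```python
class ToyIndex:
    """Toy model: writes are only visible to searches after a refresh."""

    def __init__(self):
        self.live = {}        # latest state of every document, keyed by ID
        self.searchable = {}  # what searches can see, updated on refresh

    def refresh(self):
        self.searchable = dict(self.live)

    def index(self, doc_id, doc):
        # Indexing by ID creates or overwrites, whether or not the
        # existing document is searchable yet.
        self.live[doc_id] = doc

    def delete_by_query(self, field, value):
        # Only documents visible since the last refresh can match.
        matched = [i for i, d in self.searchable.items() if d.get(field) == value]
        for doc_id in matched:
            self.live.pop(doc_id, None)
        return len(matched)


def run_iteration(idx, ids, refresh_after_insert=False):
    """delete by query -> refresh -> insert, optionally refreshing again."""
    deleted = idx.delete_by_query("X", "Y")
    idx.refresh()
    for doc_id in ids:
        idx.index(doc_id, {"X": "Y"})
    if refresh_after_insert:
        idx.refresh()
    return deleted


# Scenario from the post: no refresh between the two iterations.
idx = ToyIndex()
deleted_1 = run_iteration(idx, [1, 2, 3])
deleted_2 = run_iteration(idx, [2, 3, 4])
print(deleted_1, deleted_2, sorted(idx.live))  # 0 0 [1, 2, 3, 4]

# Same scenario, but with an extra refresh after each insert phase.
idx2 = ToyIndex()
run_iteration(idx2, [1, 2, 3], refresh_after_insert=True)
deleted_with_refresh = run_iteration(idx2, [2, 3, 4], refresh_after_insert=True)
print(deleted_with_refresh, sorted(idx2.live))  # 3 [2, 3, 4]
```

In this simplified model, the missing refresh after the insert is enough to reproduce the unexpected state. Whether the real cluster behaved this way is exactly what I am asking.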
Is this a possible explanation for my unexpected state?
Something that concerns me is that this hypothesis seems to contradict the following post: "Elasticsearch delete_by_query 409 version conflict", where a 409 error is expected instead. I did not receive any 409 errors in the scenario described above. That post also says the 409 error is caused by the lack of a refresh between the insert and the delete by query. But if there is no refresh, the documents are not yet searchable, so how can the delete by query find the documents to delete in the first place?
Any insight would be greatly appreciated. Thanks!