I've run into a problem where a single process searches ES for some documents, does some local calculations, and updates them in bulk. A few milliseconds later it is unlikely, but possible, that I need to repeat this: do another search and possibly re-update some of the same documents.
The problem is that, due to near-real-time search, it's perfectly possible that I retrieve a stale version of a document and my calculations come out wrong (I'm not even going into the possible problems of multiple concurrent updates; let's focus on a single process).
I know this is bad design, but what are my options to work around this problem? How can I be sure that I get the latest version of the document always?
The options that come to mind are:
Do some local document version tracking, and retry the _search until I get the latest version (which sounds very inefficient)
Reduce the refresh interval of the index (index.refresh_interval) so this is less likely to happen (at the cost of indexing performance)
Move all my calculations server side (do massive scripted updates via _update_by_query or bulk scripted updates) so a server-side script does all the magic, AND ensure that retry_on_conflict is set so the final result is consistent.
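For completeness, option 2 would just be a settings change, PUT to /my-index/_settings (the index name and the 100ms value are placeholders; the default refresh interval is 1s, and lowering it increases indexing overhead):

```json
{
  "index": {
    "refresh_interval": "100ms"
  }
}
```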
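Option 3 could look roughly like the sketch below, which only builds the NDJSON body for a _bulk request of scripted updates (the index name `docs`, the field `counter`, and the increment logic are placeholders standing in for my real calculation). The point is that retry_on_conflict goes in the per-action metadata, so ES re-fetches the latest document and re-runs the script if a version conflict occurs:

```python
import json

def bulk_scripted_update(doc_ids, increment):
    """Build an NDJSON _bulk body of server-side scripted updates.

    Index name "docs" and the counter-increment script are placeholders.
    """
    lines = []
    for doc_id in doc_ids:
        # Action metadata: retry up to 3 times if the doc changed underneath us.
        lines.append(json.dumps({
            "update": {"_index": "docs", "_id": doc_id, "retry_on_conflict": 3}
        }))
        # Painless script that runs on the server against the latest _source.
        lines.append(json.dumps({
            "script": {
                "source": "ctx._source.counter += params.n",
                "lang": "painless",
                "params": {"n": increment},
            }
        }))
    # _bulk bodies must be newline-delimited and end with a newline.
    return "\n".join(lines) + "\n"

body = bulk_scripted_update(["1", "2"], 5)
print(body)
# POST this to /_bulk with Content-Type: application/x-ndjson
```

(For _update_by_query itself, my understanding is that conflicts are handled with the `conflicts=proceed` parameter rather than retry_on_conflict, which belongs to the _update and _bulk APIs.)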
I'm tempted to follow my gut and go straight to Option 3 - move my code server side and rely on retry_on_conflict - but perhaps there are other options I'm missing?
Appreciate any help! Thanks!