Hi,
I am using ES 5.5 (AWS ElasticSearch service).
I am trying to use a shared ES document to assign ids to multiple machines which have access to this index.
Basically, the document looks like this:
index/my_type/docCount
{
  "count": 4
}
Assume there are 4 instances here.
I would like to assign each machine a unique id in the range 0-3 (inclusive).
So, I will need to make at least 4 requests to decrement and update the count field.
In fact, I may need more requests since there may be versioning conflicts.
I could use the script-based update method, with which the read and the decrement happen atomically:
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/docs-update.html
However, I need the actual value after updating, before any other machine has changed it.
Also, I am unable to run the script given in the link (btw, the index is hosted on AWS ElasticSearch).
So, I came up with an alternative method based on the versioning mechanism.
This is what I did (pseudocode, actual impl is in Java):
while (true) {
    doc = get_document("/index/my_type/docCount")
    version = extract_version(doc)
    count = get_count(doc)
    count = count - 1
    // write the new count, but only if the document is still at the version we read
    response = putDocument("/index/my_type/docCount?version=" + version, "{\"count\" : " + count + "}")
    if (response.status == 200) {
        refreshIndex("/index")
        assign_machine_id(count)
        break
    }
    // any other status is treated as a version conflict: loop and try again
}
As I understand it, each iteration of this loop ends in one of two ways: either the write fails with a version conflict and the machine tries again, or the write succeeds, the counter has been decremented, and the machine assigns itself the new counter value.
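To make the intended behaviour concrete, here is a minimal in-memory sketch of the same loop in Java. It does not talk to ElasticSearch at all: the class VersionedCounterSketch and its methods get, putIfVersion and claimId are hypothetical names invented for this illustration, and the AtomicReference only stands in for the shared document and its version check. It is meant to show what I expect to happen, not what the cluster actually does.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in for the shared counter document.
// Nothing here calls ElasticSearch; it only mimics "write if version matches".
class VersionedCounterSketch {

    /** Immutable snapshot of the document: its version and its count. */
    static final class Doc {
        final long version;
        final int count;
        Doc(long version, int count) {
            this.version = version;
            this.count = count;
        }
    }

    private final AtomicReference<Doc> stored;

    VersionedCounterSketch(int initialCount) {
        // Assumption: a freshly created document starts at version 1.
        stored = new AtomicReference<>(new Doc(1, initialCount));
    }

    /** Reads the current document (the get_document step). */
    Doc get() {
        return stored.get();
    }

    /** Writes newCount only if the stored version still equals expectedVersion. */
    boolean putIfVersion(long expectedVersion, int newCount) {
        Doc current = stored.get();
        if (current.version != expectedVersion) {
            return false; // simulated version conflict
        }
        return stored.compareAndSet(current, new Doc(expectedVersion + 1, newCount));
    }

    /** The retry loop from the pseudocode: read, decrement, conditional write. */
    int claimId() {
        while (true) {
            Doc doc = get();
            int next = doc.count - 1;
            if (putIfVersion(doc.version, next)) {
                return next; // our write won, so this value is ours
            }
            // conflict: another writer got there first, read again and retry
        }
    }

    public static void main(String[] args) {
        VersionedCounterSketch counter = new VersionedCounterSketch(4);
        for (int i = 0; i < 4; i++) {
            System.out.println("assigned id: " + counter.claimId());
        }
        System.out.println("final version: " + counter.get().version);
    }
}
```

In this sketch, four successful claims on a counter that starts at 4 hand out 3, 2, 1 and 0, and the version ends at 5 (one creation plus four writes). The sketch assumes that every successful write is also reported as successful to the caller.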
And this worked for a while, until today I noticed that for a setup where
the counter was initially 4,
there were 4 machines,
the ids assigned were 3, 2, 1 and -1.
When I checked the document that holds the counter, its version was 6, which means it was decremented 5 times after creation when it should have been decremented only 4 times.
In the application, the decrement function is called in only one place, so every machine calls it only once.
If it were called multiple times, that could explain an extra decrement coming from elsewhere, but it is not.
It looks as if, for the machine that got -1, the counter was decremented twice: ElasticSearch applied the first decrement successfully but still reported a version conflict, so the machine tried to decrement once more.
Is this possible?
Or am I doing something wrong?
Is there something else I can try?
Sorry for the long message, but, I wanted to give as much context as possible.
Thanks in advance!
- Dev