Hello.
I guess questions about read/write consistency in Elasticsearch are a never-ending story. I've been an ES user for at least 7 years and still struggle with them.
My team uses Elasticsearch 6.8. We will upgrade eventually, but that is not an option at the moment.
Our cluster consists of 4 nodes. The index has 2 shards and 2 replicas (3 copies of each shard in total), and all 4 nodes are master, data and ingest nodes simultaneously. Not the best setup, I know.
So here's our case and additional questions I'd like to clarify.
- If we do an update with the `wait_for` refresh strategy, what does this actually mean? Will the response be returned once the document is:
  - written and refreshed only on the primary shard
  - written and refreshed on the primary, and written to the replicas
  - written and refreshed on the primary, and written and refreshed on the replicas
  - something else.

  Also, refreshed in what way: internal or external?
- If we call `_refresh` explicitly, will it refresh all shards (primaries and replicas) across the cluster before responding, or only the shards on the node I'm sending the request to?
- We use update with `wait_for` in one of our services. A test then polls the document using the GET API, receives the updated state, proceeds, and calls another service. That other service fetches the document with the GET API again and receives the previous version of the document, which breaks our business logic.
  We tried different options for the GET API request:
  - `realtime=true`: GET is realtime by default, but it uses an internal refresh and, as I understand it, may still be inconsistent on a replica shard
  - `refresh=true`: it should have refreshed the document before getting it, but it still returns the previous version
  - `preference=_primary`: we assumed that if we use `wait_for` refresh for the update and we already got the updated document, then the primary shard for that document should be refreshed and consistent, but we still see the same problem.
We have an AWS ELB in front of all nodes to balance load, and I thought it might cache some responses, but our DevOps engineer told me that it doesn't cache anything at all.
Thanks, all answers are much appreciated.