How are write failures handled in Elasticsearch, particularly in cases where a write operation succeeds on the primary but one or more replicas fail to respond (due to a network problem or any other issue)?
Will the write/update stay on replicas where it succeeded? Even in cases where quorum is not met? How will it impact subsequent searches?
Hi @Vikas_Kumar,
As you mention `quorum`, I guess you are referring to the `action.write_consistency` setting. `quorum` is the default, so if fewer than a quorum of the replicas succeed, the write is not successful (see a few more details in the Definitive Guide). However, note that this is actually just a pre-check before the actual replication takes place.
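To make the pre-check concrete, here is a minimal sketch in Python. It uses the quorum formula given in the Definitive Guide, `int((primary + number_of_replicas) / 2) + 1`. The function names `quorum` and `write_allowed` are made up for illustration and are not part of any Elasticsearch API. The special case for a single replica reflects my reading of the guide (quorum is only enforced when `number_of_replicas` is greater than 1), so treat it as an assumption.

```python
def quorum(number_of_replicas: int) -> int:
    """Quorum of shard copies: int((primary + number_of_replicas) / 2) + 1."""
    primary = 1  # there is always exactly one primary per shard
    return (primary + number_of_replicas) // 2 + 1


def write_allowed(active_copies: int, number_of_replicas: int) -> bool:
    """Hypothetical pre-check: are enough shard copies active to start the write?

    This only models the check done *before* replication; it says nothing
    about whether every replica later acknowledges the write.
    """
    if number_of_replicas <= 1:
        # Assumption: quorum is only enforced with more than one replica,
        # so an active primary alone is enough here.
        return active_copies >= 1
    return active_copies >= quorum(number_of_replicas)


if __name__ == "__main__":
    # With 2 replicas there are 3 copies in total, and the quorum is 2.
    print(quorum(2))
    print(write_allowed(active_copies=1, number_of_replicas=2))
    print(write_allowed(active_copies=2, number_of_replicas=2))
```

Because this is only a pre-check, passing it does not guarantee that a quorum of copies ends up holding the write.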
We keep the write on the replicas where it succeeded. If one of the replicas fails, the master assigns a new replica on another node, and the shard in question is then synced to that new replica.
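The behavior described above can be illustrated with a toy in-memory model. This is not Elasticsearch code: `ShardCopy`, `replicate`, and `spare_nodes` are hypothetical names, and the real allocation decision is made by the master rather than inline as shown here. The sketch only shows that surviving replicas keep the write and that a failed replica is replaced by a fresh copy synced from the primary.

```python
from dataclasses import dataclass, field


@dataclass
class ShardCopy:
    """Toy stand-in for one copy of a shard living on a node."""
    node: str
    docs: dict = field(default_factory=dict)
    reachable: bool = True


def replicate(primary, replicas, doc_id, doc, spare_nodes):
    """Apply a write to the primary, then to each replica.

    Returns the replica set after the write: reachable replicas keep the
    document, unreachable ones are replaced by a new copy on a spare node
    (if any), synced from the primary.
    """
    primary.docs[doc_id] = doc
    current = []
    for replica in replicas:
        if replica.reachable:
            replica.docs[doc_id] = doc  # the write stays on this replica
            current.append(replica)
        elif spare_nodes:
            # Simplification: allocate the replacement immediately and
            # copy the primary's full contents to it.
            current.append(ShardCopy(spare_nodes.pop(0), dict(primary.docs)))
    return current


if __name__ == "__main__":
    primary = ShardCopy("node-1")
    replicas = [ShardCopy("node-2"), ShardCopy("node-3", reachable=False)]
    replicas = replicate(primary, replicas, "1", {"user": "vikas"}, ["node-4"])
    for copy in replicas:
        print(copy.node, copy.docs)
```

In this model, searches that hit any remaining copy see the document, which matches the description above; the edge cases where real clusters can diverge are the ones tracked on the resiliency page.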
However, consistency in distributed systems is a hard problem and we document known edge cases and Elasticsearch's behavior / the status of our fixes on the resiliency page.
Daniel