Write failure handling in Elasticsearch


(Vikas Kumar) #1

How are write failures handled in Elasticsearch, particularly in cases where a write operation succeeds on the primary but one or more replicas fail to respond (due to a network problem or any other issue)?

Will the write/update stay on replicas where it succeeded? Even in cases where quorum is not met? How will it impact subsequent searches?


(Daniel Mitterdorfer) #2

Hi @Vikas_Kumar,

as you mention quorum, I guess you are referring to the action.write_consistency setting. quorum is the default, so if fewer than a quorum of the shard copies (the primary plus its replicas) are available, the write is rejected (see the Definitive Guide for a few more details). However, note that this is just a pre-check made before the actual replication takes place.
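To make the pre-check concrete, here is a minimal sketch in Python. It assumes the quorum formula given in the Definitive Guide, int((primary + number_of_replicas) / 2) + 1, and the guide's note that quorum is only enforced when number_of_replicas is greater than 1. The function names are hypothetical and are not part of any Elasticsearch API.

```python
def quorum(number_of_replicas: int) -> int:
    """Quorum of shard copies, counting one primary plus its replicas.

    Assumed formula (from the Definitive Guide):
    int((primary + number_of_replicas) / 2) + 1
    """
    primary = 1
    return (primary + number_of_replicas) // 2 + 1


def write_precheck_passes(active_copies: int, number_of_replicas: int) -> bool:
    """Hypothetical model of the write_consistency=quorum pre-check.

    active_copies counts the shard copies (primary included) that are
    currently available. This only models the check made before
    replication starts; it says nothing about whether replication
    itself later succeeds on every copy.
    """
    if number_of_replicas <= 1:
        # Assumption: quorum is not enforced with 0 or 1 replicas,
        # so an available primary is enough.
        return active_copies >= 1
    return active_copies >= quorum(number_of_replicas)
```

For example, with number_of_replicas set to 2 there are three copies in total and quorum(2) is 2, so write_precheck_passes(1, 2) is False while write_precheck_passes(2, 2) is True.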

Writes are kept on the replicas where they succeeded. If one of the replicas fails, we get a new replica node assignment from the master, and the shard in question is then synced to the new replica.

However, consistency in distributed systems is a hard problem, and we document known edge cases, Elasticsearch's behavior, and the status of our fixes on the resiliency page.

Daniel

