Restarting ES on a node

shivam.dixit · March 19, 2018, 8:15pm

https://www.elastic.co/guide/en/elasticsearch/guide/current/distrib-write.html

Consider a cluster as explained in the link above. By default, a write is applied to the primary shard and replicated synchronously to the replica shards before a success status is returned to the client.

Let's say we are constantly updating data for index 0, which has replica shards on both Node 1 and Node 2.
As per my understanding, updates are applied synchronously to the primary and replica shards. Now let's say one of the nodes (Node 2) goes down and then comes back up immediately, but misses some of the updates in between. In this case, will the updates succeed by writing only to the remaining shards? And does that mean the shards will be inconsistent once the node comes back up?
Or will the updates start failing altogether since one replica is missing?

I am assuming there is a third case. As soon as the master senses that a node has gone down, it will start replicating the shard to another node. But what happens to the shard lying idle on Node 2? Does it get tagged as 'dirty data' and become eligible for deletion? And what happens to writes that were made before the master could sense that a node was down? I know I am asking a lot of questions (with fallbacks), so I am looking for a detailed response.

Christian_Dahlqvist · March 20, 2018, 8:42am

In older versions of Elasticsearch, the shard that went offline would be marked as invalid as soon as a write did not succeed on it, and the whole shard would be replaced once it came back up. With the introduction of sequence IDs in Elasticsearch 6.0, this process has been made much more efficient: it is now possible to recover just the missing operations instead of replacing the whole shard, as long as the node has not been down for too long.
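
To make the difference between the two recovery modes concrete, here is a rough toy model in Python. It is only a sketch of the idea, not how Elasticsearch implements it: `ShardCopy`, `recover` and the `retained_from` parameter are hypothetical names invented for this illustration, and "the node has not been down too long" is modelled simply as "the primary still retains every operation the replica is missing".

```python
from dataclasses import dataclass, field


@dataclass
class ShardCopy:
    """Toy shard copy: a map of sequence number -> operation (hypothetical model)."""
    ops: dict = field(default_factory=dict)

    def apply(self, seq_no, op):
        self.ops[seq_no] = op

    def local_checkpoint(self):
        """Highest seq_no such that every operation up to it is present (-1 if none)."""
        checkpoint = -1
        while checkpoint + 1 in self.ops:
            checkpoint += 1
        return checkpoint


def recover(primary, replica, retained_from):
    """Bring a returning replica up to date with the primary.

    retained_from is an assumed parameter: the lowest seq_no for which the
    primary still keeps operation history that it can replay.
    """
    start = replica.local_checkpoint() + 1
    if start >= retained_from:
        # Everything the replica missed is still retained: replay only those ops.
        missing = [s for s in sorted(primary.ops) if s >= start]
        for seq_no in missing:
            replica.apply(seq_no, primary.ops[seq_no])
        return ("ops", len(missing))
    # History is gone (node was away too long): replace the whole copy.
    replica.ops = dict(primary.ops)
    return ("full_copy", len(primary.ops))
```

For example, if the primary holds operations 0 to 9 and the replica went offline after operation 5, calling `recover` with `retained_from=3` replays only operations 6 to 9, whereas `retained_from=8` forces a copy of all ten operations.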

