Clarifying the difference between active shards and in-sync copies


The replication docs state

a single in-sync copy is sufficient to serve read requests

and then, when describing the read model, the same doc states that the basic flow

Select[s] an active copy of each relevant shard

This suggests that in-sync copies and active copies are the same thing.

But this blog post states

one of the active replicas, which is also in the in-sync set

which clearly implies that there is a difference between the two.

Adding to the confusion (for me at least) is the wait_for_active_shards parameter on the Index API, which makes no reference to in-sync copies.
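For reference, wait_for_active_shards is passed as a query-string parameter on the index request. Below is a minimal sketch that only builds the request URL and sends nothing; the host, index name, document ID and the helper name index_request_url are placeholders of mine, not taken from the docs.

```python
from urllib.parse import urlencode


def index_request_url(base_url, index, doc_id, wait_for_active_shards=None):
    """Build the URL for an Index API request (PUT /<index>/_doc/<id>).

    Hypothetical helper for illustration; it does not contact a cluster.
    """
    url = f"{base_url}/{index}/_doc/{doc_id}"
    if wait_for_active_shards is not None:
        # Accepts a number of shard copies or the string "all".
        url += "?" + urlencode({"wait_for_active_shards": wait_for_active_shards})
    return url


# Placeholder host and index name.
url = index_request_url("http://localhost:9200", "my-index", "1",
                        wait_for_active_shards=2)
print(url)
```

Note that the parameter is phrased purely in terms of active shard copies, which is what prompted the question.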

Are active copies and in-sync copies the same thing? If not, could you please clarify the difference?

Many thanks


ES support confirmed that they are effectively the same; I think I mis-parsed the quote from the blog post above. However, it is possible for a shard copy to be in-sync but not active, e.g. when a green index has been closed.

A shard copy can also be in-sync but not active if it's unassigned (e.g. the node holding it has shut down) but nothing has been written to that shard since the last time that copy was active. The converse is not possible, however: all active shards are necessarily in-sync.
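To make the relationship concrete, here is a toy model in Python. It is only a sketch under my own assumptions, not how Elasticsearch represents this internally: the ShardCopy class, the allocation IDs and the helper is_active are made up, and I'm assuming "active" corresponds to the STARTED or RELOCATING routing states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardCopy:
    allocation_id: str  # made-up IDs for illustration
    state: str          # "STARTED", "RELOCATING", "INITIALIZING" or "UNASSIGNED"


def is_active(copy):
    # Assumption: a copy is active when it is started or relocating.
    return copy.state in ("STARTED", "RELOCATING")


# The in-sync set remembers copies that hold all acknowledged writes,
# whether or not they are currently assigned to a node.
in_sync_ids = {"a1", "a2"}

copies = [
    ShardCopy("a1", "STARTED"),     # active and in-sync
    ShardCopy("a2", "UNASSIGNED"),  # in-sync, but its node has shut down
]

active = [c for c in copies if is_active(c)]
in_sync_not_active = [
    c for c in copies
    if c.allocation_id in in_sync_ids and not is_active(c)
]

# Every active copy is in-sync; the converse does not hold.
assert all(c.allocation_id in in_sync_ids for c in active)
print([c.allocation_id for c in in_sync_not_active])
```

In this model only the copies in active could serve a read; a2 remains in the in-sync set while it has nowhere to run.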

a single in-sync copy is sufficient to serve read requests

I think that's wrong: the copy has to be active to serve read requests. If it's closed or unassigned then it obviously can't serve searches, even if it's still in-sync.
