Shard routing and consistency in Elasticsearch

Hello everyone,

I have two questions about my understanding of shard routing and consistency in Elasticsearch.

I read the documentation page "Reading and Writing documents" and concluded that Elasticsearch is eventually consistent with read-your-own-writes. That is, if a client writes to Elasticsearch, it will always read the value it just wrote, because the primary shard only acknowledges the write once all replica shards have confirmed it. Another client that concurrently reads the same value, on the other hand, might or might not see the most recent value, depending on whether its request is routed to the primary, which holds the latest value, or to a replica that has not yet finished the write.

Is this assumption correct?

I also read this, where Jörg Prante comes to the same conclusion, but since it was posted in 2014 I'm not sure whether it is still up to date.

My second question:
Regarding the PACELC theorem: given the replication behavior and the assumption above, I assume that Elasticsearch chooses availability in the case of a partition and low latency otherwise.

Is this correct, too?

Stay healthy,
Ole

Not really, no. Documents are only made visible to searches after a refresh, so (a) you need to refresh to read your own write, and (b) replicas may be ahead of the primary if they refresh first. It's basically a mistake to distinguish primaries and replicas when talking about these kinds of questions; you should consider them interchangeable.
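
To make the refresh point concrete, here is a minimal toy model in Python. It is only a sketch and not Elasticsearch code: the `ShardCopy` class, its buffer, and its methods are hypothetical names invented for illustration. It assumes just what is described above, namely that a write becomes searchable on a given shard copy only after that copy refreshes, and that copies refresh independently.

```python
class ShardCopy:
    """Toy model of one shard copy (primary or replica). Hypothetical, not Elasticsearch code."""

    def __init__(self, name):
        self.name = name
        self._buffer = {}      # writes accepted but not yet visible to searches
        self._searchable = {}  # documents visible to searches

    def index(self, doc_id, source):
        # The write is accepted on this copy, but searches cannot see it yet.
        self._buffer[doc_id] = source

    def refresh(self):
        # A refresh makes everything written so far visible to searches.
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search(self, doc_id):
        # Searches only see what the last refresh exposed.
        return self._searchable.get(doc_id)


primary = ShardCopy("primary")
replica = ShardCopy("replica")

# The write is applied on both copies before it is acknowledged to the client.
for shard_copy in (primary, replica):
    shard_copy.index("1", {"value": "new"})

# Neither copy has refreshed, so no search sees the write, not even the writer's.
before_refresh = (primary.search("1"), replica.search("1"))

# In this run the replica happens to refresh first, so it is "ahead" of the primary.
replica.refresh()
replica_first = (primary.search("1"), replica.search("1"))

# Once the primary refreshes too, both copies return the document.
primary.refresh()
after_both = (primary.search("1"), replica.search("1"))

print(before_refresh)
print(replica_first)
print(after_both)
```

In a real cluster you would not model this yourself. The index API accepts a `refresh` parameter (`true` or `wait_for`) if a write needs to be searchable by the time the request returns.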

Not really, no; for starters the CAP theorem doesn't really apply since Elasticsearch isn't a linearisable register, but also Elasticsearch leans more towards consistency over availability for writes. Basically it's complicated.


Thanks for your answer, that made it clearer.
