Elastic data consistency

We are setting up Elastic Cloud in our corporate data center.

The question is about keeping data consistent across servers. We want to make sure users get consistent data when they request it, i.e. if someone requests data during an update, they should not get partially updated data. In another scenario, one request might go to one shard, and a later request might go to another shard that has not been updated yet and still holds stale data.

From what we understand, we can regulate this through the consistency setting (described here: https://www.elastic.co/guide/en/elasticsearch/guide/current/distrib-write.html). If we set it to “all”, it will make sure that all the shards are updated before the data starts being served.

The question is: is this a setting we would need to set on our cluster? Are there adverse consequences to doing so, and is there a better way?

That documentation is for an antiquated version of Elasticsearch. You should definitely not install Elasticsearch 2.x for your company, but one of the modern 6.x versions, where shard synchronization is handled much better. If I remember correctly, an update to a primary shard won't return OK before all the replicas have also been updated.
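For what it's worth, the `consistency` parameter from that old guide was replaced in 5.x and later by `wait_for_active_shards`, and there is a separate `refresh` parameter that controls when a write becomes visible to searches. Below is a minimal sketch of what such a request URL could look like. The host, index name, document type and the `build_index_url` helper are all made up for illustration, and the function only builds the URL without sending anything:

```python
from urllib.parse import urlencode


def build_index_url(host, index, doc_type, doc_id,
                    wait_for_active_shards="all", refresh="wait_for"):
    """Build the URL for an index request (hypothetical helper, sends nothing)."""
    params = {}
    if wait_for_active_shards is not None:
        # Number of shard copies that must be active before the write proceeds
        params["wait_for_active_shards"] = wait_for_active_shards
    if refresh is not None:
        # "wait_for" delays the response until the change is searchable
        params["refresh"] = refresh
    url = f"{host}/{index}/{doc_type}/{doc_id}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


# Assumed local test node and example index; adjust to your cluster.
print(build_index_url("http://localhost:9200", "orders", "_doc", 1))
```

As far as I know, `wait_for_active_shards` is only checked before the write starts, so it is a safeguard rather than a guarantee, and requiring `all` means writes will wait or time out whenever a replica is unavailable.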

The "Tracking in-sync shard copies" blog post from February 2017 states:

The primary instructs the active master to remove the IDs of the divergent shard copies from the in-sync set. The primary then only acknowledges the write request to the client after it has received confirmation from the master that the in-sync set has been successfully updated by the consensus layer. This ensures that only shard copies that contain all acknowledged writes can be selected as primary by the master.

So you can safely assume that searches against a replica shard will return the same results as if the search had hit the primary shard, since both contain the same acknowledged writes.
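To make the quoted rule a bit more concrete, here is a toy model of it. This is not Elasticsearch code: the `ReplicationGroup` class, its methods and the shard names are invented purely for illustration, and the real master and consensus layer are reduced to a plain Python set:

```python
class ReplicationGroup:
    """Toy model of one primary and its replicas with an in-sync set."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.in_sync = {primary, *replicas}
        self.docs = {copy: [] for copy in self.in_sync}

    def write(self, doc, unreachable=()):
        """Apply a write and return True once it may be acknowledged."""
        self.docs[self.primary].append(doc)
        # sorted() makes a list, so the set can be changed inside the loop
        for copy in sorted(self.in_sync - {self.primary}):
            if copy in unreachable:
                # Divergent copy: drop it from the in-sync set before acking
                self.in_sync.discard(copy)
            else:
                self.docs[copy].append(doc)
        return True

    def promotable(self):
        """Copies that could be chosen as the new primary."""
        return sorted(self.in_sync - {self.primary})


group = ReplicationGroup("p0", ["r1", "r2"])
group.write("a")
group.write("b", unreachable={"r2"})   # r2 misses this write
print(group.promotable())              # only copies holding every acked write
print(group.docs["r1"] == group.docs["p0"])
```

In this sketch the copy that missed a write stays out of the in-sync set, so it can never be promoted while it lacks acknowledged data.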

