A new replication type: physical replication

We discussed this as a team again today. The performance points raised above are significant:

  • Elasticsearch is deployed in a very wide variety of environments, and trading off CPU against bandwidth is not going to work everywhere. For instance, inter-AZ traffic on AWS carries a financial cost, so the extra bandwidth consumed by segment-based replication would add to the cost of running an Elasticsearch cluster there, and the CPU savings may well not offset that extra cost. Furthermore, those extra costs are very hard to predict because merges are not very predictable, and merges would then have a cluster-wide impact on performance rather than their effects being isolated to each node.

  • The extra time that segment-based replication adds to refreshes would not be acceptable to some users.

Some other points were also raised:

  • The allocation of primaries would have to be balanced across the cluster, because primaries would be doing more work than the replicas. Today there is no such balancing algorithm. Additionally, this balancing would require the ability to demote a primary back to being a replica, which cannot be done gracefully today.

  • Because different versions of Elasticsearch use different Lucene versions, it would be challenging to support mixed-version clusters if the segments were being replicated.

  • There are some advantages to refreshes occurring in a more coordinated fashion, but coordinating them would also add considerable complexity.

  • One can achieve a similar sort of load profile by performing indexing with number_of_replicas: 0 and then adding replicas once indexing is complete, because the recovery of a brand-new replica operates mostly on the segments themselves.
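
To illustrate that last point, here is a minimal sketch of the two settings changes involved. It only builds the requests for the update index settings API (PUT /<index>/_settings) and does not send them. The index name my-index, the address http://localhost:9200 and the helper replica_settings_request are assumptions made up for this example, and the final replica count of 1 is just a placeholder.

```python
import json
import urllib.request


def replica_settings_request(base_url, index, replicas):
    """Build (but do not send) a PUT /<index>/_settings request
    that sets index.number_of_replicas. Hypothetical helper."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed cluster address and index name, for illustration only.
BASE_URL = "http://localhost:9200"
INDEX = "my-index"

# Step 1: drop the replicas before the heavy indexing starts.
before_indexing = replica_settings_request(BASE_URL, INDEX, 0)

# Step 2: once indexing is complete, add replicas back. The new
# replicas are then built by recovery rather than by re-indexing.
after_indexing = replica_settings_request(BASE_URL, INDEX, 1)

for request in (before_indexing, after_indexing):
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

To actually apply a change you would pass each request to urllib.request.urlopen, or use whichever client you already have. Bear in mind that with zero replicas there is no redundant copy of the data while indexing is in progress.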

It was quite a long and interesting discussion, and a number of advantages of segment-based replication were identified too, but on balance the conclusion was that this isn't a path we expect to follow in the foreseeable future.
