Shard replication mechanism?

fnord_99 · August 4, 2011, 2:47pm

Hi guys,

I'm playing around a little bit with ES and now have an index with 35 Shards
and 1 Replica that I wanted to scale up to 20 replicas (on 21 nodes). While
performing the replication increase I have observed that it is overall very
slow. The servers are interconnected with bonded Gigabit network cables
within the same switch (giving 2Gbit/server). So I would assume that a
throughput of 100MByte/s between two nodes (!) would be achievable at least.
However, checking IO performance with iotop, the average "replication rate"
for the total cluster (!) is at approximately 20 MByte /sec which is awfully
slow.

I'm interested now in how elasticsearch does replicate from one server to
another. Is there a specific algorithm behind it and how does it work? Why
is the replication rate so slow? Are there any possibilities to increase the
speed?

Thanks already in advance a lot for your help!

Cheers,
fnord

kimchy · August 4, 2011, 5:50pm

First, to scale to 20 nodes you don't need to increase the replica count to
20. You have 35 shards with 1 replica, thats 70 shards to be distributed
across the available machines.

Regarding the throughput that you see, there is throttling in the number of
parallel transfers happening to perform recovery (whcih happens when you add
a replica, where the new shard will recover from the primary shard). The
throttling is there so it won't interfere with ongoing search and indexing
requests.

On Thu, Aug 4, 2011 at 5:47 PM, fnord 99 fnord999@googlemail.com wrote:

Hi guys,

I'm playing around a little bit with ES and now have an index with 35
Shards and 1 Replica that I wanted to scale up to 20 replicas (on 21 nodes).
While performing the replication increase I have observed that it is overall
very slow. The servers are interconnected with bonded Gigabit network cables
within the same switch (giving 2Gbit/server). So I would assume that a
throughput of 100MByte/s between two nodes (!) would be achievable at least.
However, checking IO performance with iotop, the average "replication rate"
for the total cluster (!) is at approximately 20 MByte /sec which is awfully
slow.

I'm interested now in how elasticsearch does replicate from one server to
another. Is there a specific algorithm behind it and how does it work? Why
is the replication rate so slow? Are there any possibilities to increase the
speed?

Thanks already in advance a lot for your help!

Cheers,
fnord