Distributed Search / Replication - some questions for a better understanding


(maho) #1

hi,

  1. Does Elasticsearch use Master/Slave replication or Multi-Master
    replication?

  2. If Elasticsearch uses Master/Slave replication: how can I know
    which instance is the master?

  3. Does Elasticsearch have an integrated load balancer? If yes, does it
    matter which instance I use for queries?

Thanks.


(Shay Banon) #2

A cluster in elasticsearch elects a single node as the master node. The master node is only responsible for cluster-level operations and for reacting to cluster changes (for example, initiating shard rebalancing when a node joins or leaves).
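To find out which node is currently the master, one option is to read the cluster state. The following is a minimal sketch, assuming the cluster state response (from `GET /_cluster/state`) carries a `master_node` id and a `nodes` map keyed by node id. The helper name `master_node_name` and the sample data are hypothetical and hand-written, not real cluster output.

```python
from typing import Optional


def master_node_name(cluster_state: dict) -> Optional[str]:
    """Return the name of the elected master from a cluster-state dict.

    Assumes the dict has a "master_node" id and a "nodes" map keyed by id.
    Returns None if the master cannot be determined.
    """
    master_id = cluster_state.get("master_node")
    node = cluster_state.get("nodes", {}).get(master_id)
    if node is None:
        return None
    # Fall back to the raw id if the node entry has no name.
    return node.get("name", master_id)


# Hand-written sample shaped like a cluster-state response (an assumption).
sample_state = {
    "master_node": "node-id-1",
    "nodes": {
        "node-id-1": {"name": "es-a"},
        "node-id-2": {"name": "es-b"},
    },
}

print(master_node_name(sample_state))
```

In practice the dict would come from parsing the JSON body of the cluster state request; the parsing step is left out here to keep the sketch independent of any particular HTTP client.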

Each index is broken down into shards. Each shard can have 0 or more replicas. When an index is created and shards are allocated, a primary shard is elected within each shard replication group. Primary shards are used for "dirty" (write) operations: writes funnel into the primary, which then replicates them to the replicas. When a node holding a primary shard is removed, one of its replicas becomes the primary, and reallocation happens.
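The primary/replica behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not how elasticsearch implements it: the `ReplicationGroup` class, its methods, and the "promote the first remaining replica" rule are all assumptions made for illustration.

```python
class ReplicationGroup:
    """Toy model of one shard's copies: a primary plus zero or more replicas."""

    def __init__(self, nodes):
        # The first node holds the primary copy; the rest hold replicas.
        self.primary = nodes[0]
        self.replicas = list(nodes[1:])
        self.docs = {node: {} for node in nodes}

    def index(self, doc_id, doc):
        # Writes go to the primary first, then are copied to every replica.
        self.docs[self.primary][doc_id] = doc
        for node in self.replicas:
            self.docs[node][doc_id] = doc

    def remove_node(self, node):
        # Drop the node's copy; promote a replica if the primary was lost.
        self.docs.pop(node, None)
        if node == self.primary:
            if not self.replicas:
                raise RuntimeError("no replica left to promote")
            self.primary = self.replicas.pop(0)
        elif node in self.replicas:
            self.replicas.remove(node)


group = ReplicationGroup(["node1", "node2", "node3"])
group.index("1", {"title": "hello"})
group.remove_node("node1")  # the node holding the primary goes away

print(group.primary)
print(group.docs[group.primary])
```

After `node1` is removed, a former replica serves as the primary and still holds the document that was written before the failure.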

Load-balancing wise, even if you always hit the same node in the cluster with search requests, they will be spread across all the shards and replicas. It is still recommended to do some round robin between the nodes you hit, just to get better network behavior.
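A client-side round robin over the nodes could look roughly like this. It is a minimal sketch: the node addresses, the index name `articles`, and the helper `next_search_url` are hypothetical, and it only builds the URL to query rather than sending the request.

```python
from itertools import cycle

# Hypothetical node addresses; replace with the nodes of your own cluster.
NODES = [
    "http://es-node1:9200",
    "http://es-node2:9200",
    "http://es-node3:9200",
]

# cycle() yields the nodes in order and starts over after the last one.
_node_cycle = cycle(NODES)


def next_search_url(index: str) -> str:
    """Return the search URL on the next node in round-robin order."""
    node = next(_node_cycle)
    return f"{node}/{index}/_search"


urls = [next_search_url("articles") for _ in range(4)]
for url in urls:
    print(url)
```

The fourth request wraps around to the first node again, so each node receives roughly the same share of incoming search requests.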

-shay.banon
On Wednesday, May 18, 2011 at 3:15 PM, maho wrote: (original message quoted in #1 above)

