Primary shard rebalancing


(Kellan) #1

I have a 2-node (1 process on each node) ES cluster set up with 2
shards and 1 replica per shard. With this configuration, I would think
that the ideal balance would be 1 primary shard and 1 replica shard on
each node, and indeed after the initial data insert, this is the case.
However, after one or both processes are restarted, the cluster seems
to "rebalance" itself with both primary shards on one node and both
replicas on the other. Is there a way to direct the cluster back
toward the 1 primary/1 replica per node configuration? Is it correct
that all updates go to the primary shard? My index configuration is
below.

Thanks for any help you can provide,
Kellan

index:
  number_of_shards: 2
  number_of_replicas: 1

bootstrap.mlockall: true

cluster.name: shardtest

network.host: ip1

http.port: 9200

transport.port: 9400

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: [ip1, ip2]
discovery.zen.minimum_master_nodes: 2
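
To make the layout question concrete, here is a minimal Python sketch. It is not Elasticsearch code: `possible_layouts`, `copies_per_node` and the node names are made up for illustration. The only assumption it makes about Elasticsearch is that a replica is never allocated on the same node as its primary. It enumerates where the two primaries can land on a 2-node cluster with the index settings above.

```python
from itertools import product

NODES = ("node1", "node2")   # hypothetical node names
NUMBER_OF_SHARDS = 2         # index.number_of_shards from the config above
NUMBER_OF_REPLICAS = 1       # index.number_of_replicas from the config above

# Every shard exists once as a primary plus once per replica.
TOTAL_COPIES = NUMBER_OF_SHARDS * (1 + NUMBER_OF_REPLICAS)


def possible_layouts(nodes=NODES, shards=NUMBER_OF_SHARDS):
    """Yield every layout as {shard_id: (primary_node, replica_node)}.

    Assumes exactly two nodes and one replica, with the replica always
    placed on the node that does not hold the primary.
    """
    for primaries in product(nodes, repeat=shards):
        layout = {}
        for shard_id, primary in enumerate(primaries):
            replica = nodes[1] if primary == nodes[0] else nodes[0]
            layout[shard_id] = (primary, replica)
        yield layout


def copies_per_node(layout, nodes=NODES):
    """Count the shard copies (primary or replica) held by each node."""
    counts = {node: 0 for node in nodes}
    for primary, replica in layout.values():
        counts[primary] += 1
        counts[replica] += 1
    return counts


if __name__ == "__main__":
    for layout in possible_layouts():
        print(layout, copies_per_node(layout))
```

Under that assumption there are four possible layouts, and in each of them every node holds two of the four shard copies, one for each shard. Only the primary/replica labels differ between layouts.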


(Shay Banon) #2

There is no point in balancing primary allocation, since primary and
replica shards do the same work.

On Fri, Dec 16, 2011 at 12:34 AM, Kellan wampleek@gmail.com wrote:

I have a 2-node (1 process on each node) ES cluster setup with 2
shards and 1 replica per shard. With this configuration, I would think
that the ideal balance would be 1 primary shard and 1 replica shard on
each node, and indeed after the initial data insert, this is the case.
However, after one or both processes are restarted, the cluster seems
to "rebalance" itself with both primary shards on one node and both
replicas on the other. Is there a way to direct the cluster back
toward the 1 primary/1 replica per node configuration? Is it correct
that all updates go to the primary shard? My index configuration is
below.



(Kellan) #3

Can replica shards handle data inserts? I thought the primary shard
handled all data inserts and reindexing.

On Dec 16, 10:48 am, Shay Banon kim...@gmail.com wrote:

There is no point in balancing primary allocation, since primary and
replica shards do the same work.



(Shay Banon) #4

Yes, replicas also handle indexing, in order to support (near) real-time
search and HA.
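
As a rough illustration of why that matters for search and HA, here is a toy Python model. It is not how Elasticsearch is implemented: `ShardCopy`, `ReplicatedShard` and `fail_primary` are hypothetical names, and the "index" is just a dictionary. It assumes every write is applied to both copies, so the replica can answer searches and can take over if the primary's node is lost.

```python
class ShardCopy:
    """One copy of a shard: a toy in-memory index keyed by document id."""

    def __init__(self, node):
        self.node = node
        self.docs = {}

    def index(self, doc_id, source):
        self.docs[doc_id] = source

    def search(self, term):
        # Return the ids of all documents whose text contains the term.
        return sorted(d for d, text in self.docs.items() if term in text)


class ReplicatedShard:
    """A primary copy plus one replica copy of the same shard."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica

    def index(self, doc_id, source):
        # The write is applied to the primary and then to the replica,
        # so both copies do the indexing work.
        self.primary.index(doc_id, source)
        if self.replica is not None:
            self.replica.index(doc_id, source)

    def fail_primary(self):
        """Simulate losing the primary's node by promoting the replica."""
        if self.replica is None:
            raise RuntimeError("no replica left to promote")
        self.primary, self.replica = self.replica, None


if __name__ == "__main__":
    shard = ReplicatedShard(ShardCopy("node1"), ShardCopy("node2"))
    shard.index("1", "hello world")
    print(shard.replica.search("hello"))   # the replica can serve the search
    shard.fail_primary()
    print(shard.primary.node, shard.primary.search("hello"))
```

Because the replica has already indexed every document, promoting it in this model needs no catch-up step, and searches can be answered by either copy.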

On Fri, Dec 16, 2011 at 9:44 PM, Kellan wampleek@gmail.com wrote:

Can replica shards handle data inserts? I thought the primary shard
handled all data inserts and reindexing.



(Lukáš Vlček) #5

Kellan,

The data is first indexed on the primary shard, and the primary then makes
sure it is also replicated to all replicas. So if you have two nodes and two
indices, each with 1 shard and 1 replica, then even if each primary were
located on a different node, indexing would still propagate every document
to both nodes equally.
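
The point about equal propagation can be checked with a small Python sketch. This is a simplified model, not Elasticsearch internals: `index_documents` is a hypothetical helper, the modulo routing is only a stand-in for real document routing, and each layout entry can be read as a shard or as a single-shard index. It counts one indexing operation on the primary's node and one on the replica's node for every document.

```python
from collections import Counter

# Each layout maps a shard id to (primary_node, replica_node).
# Node names are hypothetical.
BALANCED = {0: ("node1", "node2"), 1: ("node2", "node1")}
UNBALANCED = {0: ("node1", "node2"), 1: ("node1", "node2")}


def index_documents(layout, doc_count):
    """Return the number of indexing operations performed on each node."""
    ops = Counter()
    for doc_id in range(doc_count):
        shard_id = doc_id % len(layout)   # stand-in for real routing
        primary_node, replica_node = layout[shard_id]
        ops[primary_node] += 1            # indexed on the primary first
        ops[replica_node] += 1            # then replicated to the replica
    return dict(ops)


if __name__ == "__main__":
    print("balanced:  ", index_documents(BALANCED, 1000))
    print("unbalanced:", index_documents(UNBALANCED, 1000))
```

In this model both layouts produce the same per-node count, because every document is indexed once on each node no matter which copy is labelled primary.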

Regards,
Lukas

On Fri, Dec 16, 2011 at 9:58 PM, Shay Banon kimchy@gmail.com wrote:

Yes, replicas also handle indexing, in order to support (near) real-time
search and HA.


