Multiple primaries of the same index on the same node in a cluster. Can I make one move?

I have 4 nodes in my cluster. All settings are default; there are no
extra shard allocation definitions.
I have an index with 3 shards + 2 replicas each, for a total of 9 shard copies.
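(For anyone wanting to reproduce this layout: the index can be created with settings like the following. The index name and address are just placeholders, not my real setup.)

# 3 primaries, each with 2 replicas = 9 shard copies in total
curl -XPUT 'http://localhost:9200/myindex' -d '{
  "settings" : {
    "number_of_shards" : 3,
    "number_of_replicas" : 2
  }
}'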
I turned off one of the nodes and everything was equally distributed: 1
primary and 2 replicas on each remaining node. All is well.
I turned off one more node. As expected, 2 primaries are now on one node.

I brought both of the missing nodes back on-line, and the cluster health now reports:

  • number_of_nodes: 4
  • number_of_data_nodes: 4
  • active_primary_shards: 3
  • active_shards: 9
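(Those numbers are what the cluster health API reports; assuming Elasticsearch is listening on localhost:9200, something like this shows them:)

# reports status, number_of_nodes, active_primary_shards, active_shards, ...
curl 'http://localhost:9200/_cluster/health?pretty=true'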

Everything is back as expected, but ...

From the Cluster State view in the head plugin (_plugin/head), or just the URI /_cluster/state, I can
see that I have 2 primaries on one node; otherwise everything is
distributed well (see the listing below).
I was thinking that reallocation would eventually distribute the primary
shards around, not leaving two on the same node.
I thought this because I thought primaries do all the work when
querying, so having two on the same node would make that node work
harder while 2 others sit around waiting only for inserts.
Is this a correct description of searching?
Should I care if multiple primaries are on the same node?
If so, can I make them move through some cluster update or other method?
The following is a condensed version of the routing section of
http:/.../_cluster/state?pretty=true. Note the 2 nodes with only 2
replicas and 1 node with 2 primaries and 1 replica.

-Paul

"routing_nodes" : {
"unassigned" : [ ],
"nodes" : {
"dQtKNcrvTJm75TRZF8-6Jg" : [ {
"primary" : true,
"shard" : 0,
}, {
"primary" : false,
"shard" : 1,
}, {
"primary" : true,
"shard" : 2,
} ],
"XYDqhIA7QD-R0EsUkaepaA" : [ {
"primary" : false,
"shard" : 0,
}, {
"primary" : false,
"shard" : 2,
} ],
"SdYrPmJDR7KP43woxLVRYA" : [ {
"primary" : false,
"shard" : 1,
}, {
"primary" : false,
"shard" : 2,
} ],
"gEDcSprISSCMaJcoykUq4Q" : [ {
"primary" : false,
"shard" : 0,
}, {
"primary" : true,
"shard" : 1,
} ]
}
},

I was hoping that reallocation would notice the excess primaries on one
node and move one of them (I don't care which) to one of the other nodes
that now hold only replicas.

--

I looked into this issue in the past as well. From what I can tell
from the allocation code, primary/replica has no bearing:

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/EvenShardsCountAllocator.java

From my experience, queries are shard-specific only if you are
explicitly setting routing on the query; otherwise all nodes will
be queried regardless. That could be a factor of the types of queries I am
executing.
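(A rough example of what I mean, with a made-up index and routing value: the first search hits only the single shard that "user42" hashes to, while the second fans out to one copy of every shard.)

# only the shard that the routing value maps to is searched
curl 'http://localhost:9200/myindex/_search?routing=user42&q=field:value'

# without routing, one copy of each shard is searched
curl 'http://localhost:9200/myindex/_search?q=field:value'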

Cheers,

Ivan


--

On 9/11/2012 12:57 PM, Ivan Brusic wrote:

I looked into this issue in the past as well. From what I can tell
from the allocation code, primary/replica has no bearing:
The allocation code is responsible for moving, so to clarify your
comment: "primary/replica has no bearing on [moving] a shard copy."

"By default, the operation is randomized between the shard replicas."

Search results come back with shard counts that correspond to querying
1 copy of each shard, as would be expected.
Combining that with the other choices on that page suggests that
the "replicas" in the default random method include both the
primary and the replicas.
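(By shard counts I mean the _shards section that comes back with every search response; for this index it looks something like:)

"_shards" : {
  "total" : 3,
  "successful" : 3,
  "failed" : 0
}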
I suppose the sentence should then be:

"By default, the operation is randomized between all shard [copies
including replicas and primaries.]"

Oh no, now I should cross that bridge and check out that docs page! I'd
be glad to do that.
But I would have to know that the above re-write is correct.

Can anyone confirm that the above sentence re-write would be more accurate?

Going back to my questions:

I thought [about moving primaries] because I thought primaries do all the work when querying, so
having two on the same node would make that node work harder while 2 others
sit around waiting only for inserts.
Is this a correct description of searching?

Answer: No, any copy of a shard can service a search request under the default search request routing.
If I used search preference=_primary or _primary_first, then I might experience a larger load on one node with more than its share of primary shards.
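(For example, if I understand the preference parameter correctly, a search like this would be served only by primary copies, which is exactly the case where the node holding two primaries does extra work. The index name is just a placeholder.)

# force the search to primary shard copies only
curl 'http://localhost:9200/myindex/_search?preference=_primary&q=field:value'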

Continuing on that page: I can't imagine how the custom search preference works, so I couldn't help re-write the description of the custom value.
"A custom value will be used to guarantee that the same shards will be used for the same custom value."
It does NOT say how the value is used, or against what any such custom value is compared. For example, it says "shards"; how would the same set of shards get associated with an arbitrary value? Does that only work if I send the request to the same cluster node?
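(My best guess at how it is meant to be used, not something I have verified: you pass the same arbitrary string on every request, and that string keeps selecting the same set of shard copies, e.g. to keep one user's paging consistent.)

# the same custom string should keep hitting the same shard copies
curl 'http://localhost:9200/myindex/_search?preference=session-abc123&q=field:value'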

Should I care if multiple primaries are on the same node?

Answer: No, not as long as the default random shard selection is used.

If so, can I make them move through some cluster update or other method?

All shards are allocated according to the various cluster.routing.allocation settings (see the cluster settings documentation).

But personally I don't yet understand how those settings could move only a primary shard, or prevent two primaries from residing together, even if I wanted to.
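(If an explicit move is really wanted, I believe newer releases have a cluster reroute API that can move a named shard between named nodes. A sketch only; the index and node names are made up, the allocator may well move things back afterwards to keep per-node shard counts even, and a copy cannot be moved to a node that already holds another copy of the same shard.)

curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
  "commands" : [ {
    "move" : {
      "index" : "myindex",
      "shard" : 2,
      "from_node" : "node1",
      "to_node" : "node2"
    }
  } ]
}'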

-Paul

--

On Wed, Sep 12, 2012 at 1:19 PM, P. Hill parehill1@gmail.com wrote:

On 9/11/2012 12:57 PM, Ivan Brusic wrote:

The allocation code is responsible for moving, so to clarify your comment:
"primary/replica has no bearing on [moving] a shard copy."

Correct. That is the point I was attempting to make.

I have been in situations, primarily due to rebooting a node
without disabling re-allocation first, where the shard distribution
has been less than ideal (in terms of providing high availability in
case a node went down) and I was not able to find a way to redistribute the
shards.
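(Disabling re-allocation before a planned restart looks roughly like this; the exact setting name may differ between versions.)

# temporarily stop the cluster from re-allocating shards
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient" : { "cluster.routing.allocation.disable_allocation" : true }
}'

# ... restart the node, wait for it to rejoin, then re-enable ...
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient" : { "cluster.routing.allocation.disable_allocation" : false }
}'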

The documentation regarding the behind-the-scenes workings could be better
for those who want to fine-tune. Hopefully others more knowledgeable
will chime in.

Sematext has the best writeup for shard allocation so far:

Cheers,

Ivan

--