Java recovery api, stale shard information?


(Bitsof Info) #1

Hi,
I am using ES 1.5.2

For a given index I am trying to determine which ES node(s) hold the
primary shards at any given time.

I am using the Java API

I make a RecoveryRequest for a specific index to a node and get back
a RecoveryResponse. I then do the following to build a simple list that
contains only the nodes that hold the primary for a given shard: I check
getPrimary, then look up the node ID in a map of nodes that I built
beforehand.

response.shardResponses.values.foreach(srrList => {
    srrList.foreach(shardRecoveryResponse => {
        val recoveryState = shardRecoveryResponse.recoveryState;
        if (recoveryState.getPrimary) {
            primaryNodes += nodeMap(recoveryState.getSourceNode.getId)
        }
    })
})

The issue is this.

a) I start up my cluster with a test index, 5 shards and only a few
documents. The cluster is 4 nodes. The shards are distributed as follows
(per the head plugin, all green):

node1: s(1)-replica, s(2)-primary, s(3)-primary
node2: s(2)-replica, s(4)-replica
node3: s(0)-replica, s(3)-replica, s(4)-primary
node4: s(0)-primary, s(1)-primary

b) I run my little bit of code from above and I get back what I would
expect: node1, node3 and node4 are the only nodes in my list, because they
are the only ones with primary shards.

c) I then shut down node1 (which currently holds 2 primary shards).

d) The cluster now rebalances and looks like this (per the head plugin, all
green):

node2: s(1)-replica, s(4)-replica, s(2)-primary
node3: s(0)-replica, s(3)-primary, s(4)-primary
node4: s(0)-primary, s(1)-primary, s(2)-replica, s(3)-replica

e) I run my little bit of code again and I DON'T get back what I expect.
The data within the RecoveryResponse says the nodes holding primary shards
are node3 and node4. According to that data, node2 does not hold any
primary shards. Even after killing my client and completely reconnecting,
I get the same response.

f) The only way I can get a correct view of the cluster after a rebalance
is by closing the index, then re-opening it. Once this is done, the
data in the RecoveryResponse is correct (it matches what the head plugin
says).
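
To make the expectation in (d) and (e) concrete, here is a minimal sketch in plain Java. It does not call Elasticsearch at all: `ShardCopy`, `primaryHolders` and `layoutAfterShutdown` are hypothetical names used only for illustration, and the data is simply the head plugin layout from (d) typed in by hand. It shows the set of nodes the filtering code would be expected to produce if the primary flag it reads matched the current shard allocation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class PrimaryHolders {

    // Hypothetical stand-in for one shard copy; NOT an Elasticsearch class.
    static final class ShardCopy {
        final String node;
        final int shard;
        final boolean primary;

        ShardCopy(String node, int shard, boolean primary) {
            this.node = node;
            this.shard = shard;
            this.primary = primary;
        }
    }

    // Collect the names of all nodes that hold at least one primary copy.
    static SortedSet<String> primaryHolders(List<ShardCopy> copies) {
        SortedSet<String> nodes = new TreeSet<>();
        for (ShardCopy copy : copies) {
            if (copy.primary) {
                nodes.add(copy.node);
            }
        }
        return nodes;
    }

    // Layout from step (d), after node1 was shut down, copied by hand.
    static List<ShardCopy> layoutAfterShutdown() {
        List<ShardCopy> copies = new ArrayList<>();
        copies.add(new ShardCopy("node2", 1, false));
        copies.add(new ShardCopy("node2", 4, false));
        copies.add(new ShardCopy("node2", 2, true));
        copies.add(new ShardCopy("node3", 0, false));
        copies.add(new ShardCopy("node3", 3, true));
        copies.add(new ShardCopy("node3", 4, true));
        copies.add(new ShardCopy("node4", 0, true));
        copies.add(new ShardCopy("node4", 1, true));
        copies.add(new ShardCopy("node4", 2, false));
        copies.add(new ShardCopy("node4", 3, false));
        return copies;
    }

    public static void main(String[] args) {
        // Expected holders per the head plugin view in (d).
        System.out.println(primaryHolders(layoutAfterShutdown()));
        // prints [node2, node3, node4]
    }
}
```

The result from the RecoveryResponse in (e) was only node3 and node4, so node2 (which the head plugin shows as holding the s(2) primary) is the entry that goes missing.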

Am I using this API wrong? Is this expected?

thanks


