Abhijeet Rastogi wrote:
> Thanks for your reply. Is there a way to avoid this?
It depends on how the shards' data have changed. Currently ES looks
for Lucene segment divergence, which for a large index could mean
that it doesn't have to resync the bigger segments. It's very likely
that it will have to resync most of the shard though, especially
after a big merge.
In your case, since your node crashed, any index that was being
written to may have been corrupted anyway and would need to be
resynced regardless.
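If you want to keep an eye on that resync after a restart, the
cluster health API will tell you when things have settled. A quick
sketch, assuming a node listening on localhost:9200:

# Block until the cluster is at least yellow (all primaries
# allocated), or give up after a minute.
curl -s 'localhost:9200/_cluster/health?wait_for_status=yellow&timeout=60s&pretty'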
> Also, I noticed that once all the 4 primary shards for an index were
> on one node, the restart didn't mess anything up. If that's the
> case, then why aren't primary shards being distributed again?
ES only strives to distribute unique copies of the data, not to
balance primaries across nodes. Using the cluster reroute API[1],
however, you can cancel the allocation of a primary, which will
effectively swap it with one of its replicas.
Here's a quick example. I like to use es2unix[2] for visualizing
shard distribution; the JSON API output is hard to scan at a glance,
and browser tools are hard to paste into an email.
I have an index "wiki" with five shards and one replica:
% es shards wiki
wiki 0 r STARTED 20019 218.9mb 229571296 127.0.0.1 Scanner
wiki 0 p STARTED 20019 208.4mb 218617666 127.0.0.1 Lucas Brand
wiki 1 p STARTED 19898 210.3mb 220577145 127.0.0.1 Raza
wiki 1 r STARTED 19898 208.4mb 218612909 127.0.0.1 Lucas Brand
wiki 2 r STARTED 19985 215.5mb 226006668 127.0.0.1 Scanner
wiki 2 p STARTED 19985 221mb 231736530 127.0.0.1 Lucas Brand
wiki 3 p STARTED 20034 222.9mb 233803424 127.0.0.1 Scanner
wiki 3 r STARTED 20034 220.5mb 231221871 127.0.0.1 Raza
wiki 4 p STARTED 20064 222.7mb 233578869 127.0.0.1 Raza
wiki 4 r STARTED 20064 214.3mb 224810852 127.0.0.1 Lucas Brand
Narrowing down to shard 0:
% es shards | grep ^wiki\ 0
wiki 0 r STARTED 20019 218.9mb 229571296 127.0.0.1 Scanner
wiki 0 p STARTED 20019 208.4mb 218617666 127.0.0.1 Lucas Brand
The primary is on Lucas Brand (the third column of the output; use
--verbose for the column names). I can cancel its allocation there
with:
curl -s -XPOST localhost:9200/_cluster/reroute -d '{
  "commands" : [
    {
      "cancel" : {
        "allow_primary" : true,
        "index" : "wiki",
        "shard" : 0,
        "node" : "Lucas Brand"
      }
    }
  ]
}'
This returns the new cluster state routing table (pass dry_run if
you only want to see the result without actually changing the
cluster).
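For example, something like this should show the would-be routing
without applying it (dry_run can go on the query string or in the
request body, if I recall correctly):

# Same cancel command as above, but only a preview of the resulting
# routing; nothing actually moves.
curl -s -XPOST 'localhost:9200/_cluster/reroute?dry_run=true' -d '{
  "commands" : [
    {
      "cancel" : {
        "allow_primary" : true,
        "index" : "wiki",
        "shard" : 0,
        "node" : "Lucas Brand"
      }
    }
  ]
}'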
Now shard 0 looks like:
% es shards | grep ^wiki\ 0
wiki 0 p STARTED 20019 218.9mb 229571296 127.0.0.1 Scanner
wiki 0 r STARTED 20019 208.4mb 218617666 127.0.0.1 Lucas Brand
ES canceled the primary on Lucas Brand, looked around for a replica
to promote, and picked Scanner. Note that if you have more than one
replica, ES picks which one becomes the primary; you don't get to
choose.
> Also, is there a way to disable all this so that I can survive restarts
> without this reshuffling?
Don't fear the reshuffling. You can do some tweaking with the
cluster allocation config options[3], but I would suggest in this
case not to worry about it. When you restart a node, ES should
quickly get to a yellow health level where you can search and index
while reallocation happens in the background.
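If you do want to keep shards pinned down during a planned restart,
the usual trick is to temporarily disable allocation through the
cluster settings API and re-enable it afterwards. A rough sketch;
the exact setting depends on your ES version (older releases use
cluster.routing.allocation.disable_allocation, newer ones use
cluster.routing.allocation.enable):

# Before restarting: stop ES from reallocating shards.
curl -s -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : { "cluster.routing.allocation.disable_allocation" : true }
}'

# ... restart the node ...

# Afterwards: let allocation resume.
curl -s -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : { "cluster.routing.allocation.disable_allocation" : false }
}'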
It's usually better to let ES allocate and resync data for you. You
can waste a lot of time fixing something that's not an actual
problem.
-Drew
Footnotes:
[1] Elasticsearch reference: cluster reroute API
[2] es2unix (command-line ES): https://github.com/elastic/es2unix
[3] Elasticsearch reference: cluster shard allocation settings