Interesting, my last post was deleted. I will try it again.
On Monday, January 20, 2014 10:26:37 AM UTC+1, Alexander Reelsen wrote:
> Maybe I misread your mail, but I am actually not sure what part exactly is
> taking so much time. Is it the pinging of nodes, to finally remove a part
> of the cluster? Is it the recovery and copying of data in order to recreate
> a working cluster?
I'm not sure what it all does under the hood, but I assume it's only pinging
the nodes and removing them from the cluster. I don't think it needs to copy
any data, since one rack already has all the data and it only needs to
promote a replica to primary.
> Also, if you have different racks, you could use rack-based awareness
> allocation (I guess you do) to make sure all your data is available in case
> a rack fails.
We are already doing this. At least one replica of a shard is always on a
node in a different rack.
Example:
cluster.routing.allocation.awareness.force.zone.values: rack1,rack2
cluster.routing.allocation.awareness.attributes: ms_rack
node.ms_rack: rack1
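For illustration, a node on the other rack carries the same awareness
settings and only differs in its node-level attribute; a minimal sketch,
assuming the same attribute name on every node:

# elasticsearch.yml sketch for a node sitting in rack2; the awareness
# settings are identical to the ones above, only the node attribute changes.
cluster.routing.allocation.awareness.force.zone.values: rack1,rack2
cluster.routing.allocation.awareness.attributes: ms_rack
node.ms_rack: rack2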
> And what is an unhealthy state, exactly?
By unhealthy I mean that the head plugin shows incorrect information.
Normally it shows, for example, "esm1 cluster health: green (9, 5)", and all
the nodes in the cluster are listed under "Cluster Overview".
When the network connection between the racks is lost, the nodes under
Cluster Overview immediately disappear, while the cluster health still shows
green with the full number of nodes and shards. The node count then slowly
drops, depending on how fast the cluster detects and removes the dead nodes.
Here is an example, observed in head connected to esm1, during a network loss:
9:23:24 network loss between rack1 and rack2
9:23:43 esm1 removed node esc2 (head shows no nodes and health as green 8,5)
9:26:13 esm1 removed node esd3 (head shows no nodes and health as green 7,5)
9:26:43 esm1 removed node esd4 (head shows no nodes and health as green 6,5)
9:27:13 esm1 removed node esm2 (head shows no nodes and health as yellow 5,5)
9:27:13 head shows the cluster health as yellow (5,5), which is correct
(since some copies are not available), and the 5 remaining nodes appear again
under Cluster Overview: 2 data, 2 master, 1 client.
The result is that reading from and writing to the cluster stalled for about
4 minutes.
When the log on esm1 shows that it removed esm2, it gives the reason "failed
to ping, tried [3] times, each with maximum [5s] timeout". In my eyes that is
not really accurate, given the timing of when it actually happens: it is
about 4 minutes after the network loss.
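To make explicit which settings I believe control that detection, here is a
sketch of the zen fault-detection values as I understand them (ping_interval
and ping_retries are the documented defaults, not something I set; only the
5s timeouts appear in my config below):

# Zen fault-detection settings that determine how fast a dead node is
# dropped from the cluster. ping_interval and ping_retries are assumed
# defaults; only the timeouts are set explicitly in my config.
discovery.zen.fd.ping_interval: 1s
discovery.zen.fd.ping_timeout: 5s
discovery.zen.fd.ping_retries: 3

With these values I would expect a dead node to be dropped within roughly
ping_retries x ping_timeout plus a few ping intervals, so well under a
minute, which makes the 4-minute gap for esm2 hard to explain.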
The version I'm testing with is 0.90.10. This is the configuration of esm1:
cluster.name: MSES1Test1
node.name: "esm1"
node.master: true
node.data: false
path.data: /var/lib/elasticsearch/data
path.logs: /var/log/elasticsearch
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["esm1.ms.lan", "esm2.ms.lan", "esm3.ms.lan"]
cluster.routing.allocation.awareness.force.zone.values: rack1,rack2
cluster.routing.allocation.awareness.attributes: ms_rack
node.ms_rack: rack1
discovery.zen.ping_timeout: 5s
discovery.zen.fd.ping_timeout: 5s
Thanks
Marco