On Saturday, January 15, 2011 at 10:35 PM, barak wrote:
Doing my first steps in ES, I've few questions:
- In case the cluster is composed of N nodes, is data split equally
on the nodes?
The cluster aims to allocate an equal number of shards to each node.
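To illustrate what "an equal number of shards per node" means, here is a minimal Python sketch that deals shards out to nodes in round-robin order. It is only an illustration of the goal, not Elasticsearch's actual allocation algorithm; `assign_shards` and the node names are hypothetical.

```python
from collections import defaultdict


def assign_shards(num_shards, nodes):
    """Deal shard ids 0..num_shards-1 to nodes in round-robin order."""
    allocation = defaultdict(list)
    for shard_id in range(num_shards):
        node = nodes[shard_id % len(nodes)]
        allocation[node].append(shard_id)
    return dict(allocation)


# Hypothetical cluster: 10 shards spread over 3 nodes.
allocation = assign_shards(10, ["node1", "node2", "node3"])
counts = {node: len(shards) for node, shards in allocation.items()}
```

When the shard count does not divide evenly by the node count, the per-node counts in this sketch differ by at most one.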
- In case node is crashed, is its data backuped on the other nodes
so no data loss in case of such crash? And if the client uses this
node for queries, will it still get answers, or be notified that error
Each shard can have one or more replicas. If a node crashes, the replicas held on the other nodes act as its backup, so no data is lost, and the shards that were allocated on the crashed node are reallocated across the remaining nodes.
If a client is querying that node when it crashes, the client needs to switch to another node. If you use the REST API over HTTP, you can simply round-robin between servers.
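The round-robin idea can be sketched on the client side roughly as follows. This is a minimal sketch, not an official client: `RoundRobinClient`, `http_get`, and the host names are hypothetical, and it assumes each node serves the REST API over HTTP (9200 is the default HTTP port). A server that fails to respond is skipped and the next one is tried.

```python
import urllib.request


def http_get(url, timeout=5):
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


class RoundRobinClient:
    """Rotate requests across servers, skipping ones that fail to respond."""

    def __init__(self, servers, fetch=http_get):
        self.servers = list(servers)
        self.fetch = fetch  # injectable so the logic can be tried without a cluster
        self._next = 0      # index of the server to try first on the next call

    def get(self, path):
        last_error = None
        for attempt in range(len(self.servers)):
            index = (self._next + attempt) % len(self.servers)
            try:
                # urllib's URLError and socket errors are subclasses of OSError
                result = self.fetch(self.servers[index] + path)
            except OSError as err:
                last_error = err
                continue
            # Start the next request at the server after the one that answered.
            self._next = (index + 1) % len(self.servers)
            return result
        raise RuntimeError("no server responded") from last_error
```

Usage would look something like `RoundRobinClient(["http://es1:9200", "http://es2:9200"]).get("/_search")`, where `es1` and `es2` stand in for your own hosts.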
- In case master node is crashed, is the cluster still functioning?
Yes, another node will be elected as master.
- In case a new node joins the cluster, how much time takes the
cluster to re-balance the data (say 1B docs, 4 nodes cluster)?
That depends on your network. No reindexing is done; the data (shards) is simply moved between nodes.
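For a back-of-the-envelope lower bound, you can divide the amount of data the new node should end up holding by your network bandwidth. The sketch below does that arithmetic. Every number in it is an assumption for illustration: an average document size of 1,000 bytes, 4 nodes after the join, no replicas, and a 1 Gbit/s link. `estimate_rebalance_seconds` is a hypothetical helper, and it ignores disk speed, throttling, and concurrent traffic, so real times will be longer.

```python
def estimate_rebalance_seconds(num_docs, avg_doc_bytes, nodes_after_join,
                               bandwidth_bits_per_sec):
    """Lower-bound time to ship the new node its even share of the data."""
    total_bytes = num_docs * avg_doc_bytes
    # With an even spread, the new node should hold 1/N of the data.
    bytes_to_move = total_bytes / nodes_after_join
    return bytes_to_move * 8 / bandwidth_bits_per_sec


# Assumed figures: 1B docs, 1,000 bytes each, 4 nodes after the join, 1 Gbit/s.
seconds = estimate_rebalance_seconds(
    num_docs=1_000_000_000,
    avg_doc_bytes=1_000,
    nodes_after_join=4,
    bandwidth_bits_per_sec=1_000_000_000,
)
```

Under those assumptions the estimate comes to 2,000 seconds, or a little over half an hour.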