Network latency while adding new replica

Tarun_Jangra · October 5, 2011, 11:37am

I am confused about this. Elasticsearch follows a push replication model rather than a pull replication model, so chunks of data are pushed to the replicas. Say I have 5 nodes and I get into a situation where I need to add more nodes to increase resources. Then obviously shards are supposed to be replicated on all the new nodes. But if a shard holds millions of documents, what happens to the data while the shard is being copied to the new replicas, given that the system stays available during the copy? How does it handle updates made to the shard during the copy? And does the time until a new replica becomes available depend on the size of the shard being copied, so that it varies accordingly?

kimchy · October 6, 2011, 7:10pm

Yes, the time it takes to move shards around depends on their size.
Indexing operations can still occur during the move because Elasticsearch
uses a transaction log, which makes sure that the copy process can take
place while indexing is still happening (by replaying the log against the
copy when needed).
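
To make the idea concrete, here is a minimal toy sketch of "copy a snapshot, then replay the log". It is not Elasticsearch's actual recovery code: `ToyShard` and `recover_replica` are hypothetical names, and the shard is modelled as a plain in-memory dict purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical in-memory stand-in for a shard: a dict of id -> document."""
    docs: dict = field(default_factory=dict)
    translog: list = field(default_factory=list)  # ordered (doc_id, doc) operations

    def index(self, doc_id, doc):
        # Every write is applied and also appended to the transaction log.
        self.docs[doc_id] = doc
        self.translog.append((doc_id, doc))


def recover_replica(primary, writes_during_copy):
    # Phase 1: take a point-in-time copy and remember the log position.
    snapshot = dict(primary.docs)
    log_position = len(primary.translog)

    # Indexing continues on the primary while the (slow) copy is in flight.
    for doc_id, doc in writes_during_copy:
        primary.index(doc_id, doc)

    # Phase 2: replay the operations logged since the snapshot onto the copy.
    replica = ToyShard(docs=snapshot)
    for doc_id, doc in primary.translog[log_position:]:
        replica.index(doc_id, doc)
    return replica


primary = ToyShard()
primary.index("1", {"title": "first"})
primary.index("2", {"title": "second"})

replica = recover_replica(
    primary,
    writes_during_copy=[("3", {"title": "third"}), ("1", {"title": "first, updated"})],
)
print(replica.docs == primary.docs)
```

In this sketch the replica ends up with the same documents as the primary even though two writes arrived while the copy was running, because those writes were recorded in the log and replayed afterwards.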


Tarun_Jangra · October 6, 2011, 8:23pm

So that means there is no chance of data loss unless the whole cluster is down?

kimchy · October 6, 2011, 8:32pm

That's a different question than moving shards around, and you won't lose
data even if the whole cluster is down, as long as you bring it back up using
the same data location.


Tarun_Jangra · October 6, 2011, 8:45pm

Hi kimchy,
Thanks for your prompt reply. Suppose I need a highly available system for 10 million documents. Then what is the best combination of shards and replicas on the Amazon cloud, and why?

kimchy · October 8, 2011, 6:36pm

The number of shards really depends on the document structure, its size,
the number of fields, and so on. You will need to test a bit. As for the
number of replicas, it controls how highly available you want the index to
be. With number_of_replicas set to 1, you will have two copies of your data
(the primary and one replica).
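
As a rough illustration of the arithmetic, the sketch below builds an index settings body and counts the resulting copies. The setting names `number_of_shards` and `number_of_replicas` are the standard index settings; the helper functions and the choice of 5 shards are assumptions made only for this example, not a recommendation.

```python
import json


def index_settings(number_of_shards, number_of_replicas):
    """Build an index settings body using the two standard setting names."""
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        }
    }


def copies_of_data(number_of_replicas):
    # The primary plus each replica holds a full copy of every shard.
    return 1 + number_of_replicas


def total_shards(number_of_shards, number_of_replicas):
    # Total shard instances the cluster has to place across its nodes.
    return number_of_shards * copies_of_data(number_of_replicas)


# Hypothetical values: 5 shards, 1 replica.
body = index_settings(number_of_shards=5, number_of_replicas=1)
print(json.dumps(body, indent=2))
print(copies_of_data(1))
print(total_shards(5, 1))
```

With one replica, each shard exists twice, so a 5-shard index in this example occupies 10 shard instances in total.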

