How do replica nodes work?


(Gianluca Bassini) #1

I'm curious about the Elasticsearch cluster architecture, and I couldn't find any
documentation about it.

In particular, I'm interested in how replica nodes work. Does a replica node
receive the operation log from the master and perform the same operations (like a
MongoDB replica set), or does the replica copy the delta chunks from the master,
as in Solr?

Thanks in advance
Gianluca

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/04365499-8ea4-4094-9a2d-e4909a76e45f%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

It depends.

When a replica is not allocated yet (the default with only one node) and then gets allocated, the shard is first copied over the network, and then the transaction log is replayed for the remaining operations.
Once the replica is allocated, each operation (transaction log) is replayed on each replica.
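
To make the two cases concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not how Elasticsearch is implemented: the class and method names (`Shard`, `Primary`, `recover_replica`, `during_copy`) are made up for illustration, a shard is reduced to a dict, and the transaction log to a list.

```python
class Shard:
    """Toy shard: a dict of documents plus a list acting as a transaction log."""

    def __init__(self):
        self.docs = {}
        self.translog = []

    def apply(self, op):
        # An operation is a (doc_id, source) pair in this simplified model.
        doc_id, source = op
        self.docs[doc_id] = source
        self.translog.append(op)


class Primary(Shard):
    """Toy primary shard that forwards every operation to its replicas."""

    def __init__(self):
        super().__init__()
        self.replicas = []

    def index(self, doc_id, source):
        op = (doc_id, source)
        self.apply(op)
        # Case 2: replicas already allocated, so each operation is replayed on them.
        for replica in self.replicas:
            replica.apply(op)
        return True  # acknowledged only after primary and replicas applied it

    def recover_replica(self, during_copy=()):
        # Case 1: a new replica is allocated.
        replica = Shard()
        checkpoint = len(self.translog)
        # Phase 1: copy the existing shard data "over the network".
        replica.docs = dict(self.docs)
        # Writes may keep arriving on the primary while the copy is running.
        for doc_id, source in during_copy:
            self.index(doc_id, source)
        # Phase 2: replay the operations logged since the copy started.
        for op in self.translog[checkpoint:]:
            replica.apply(op)
        self.replicas.append(replica)
        return replica


if __name__ == "__main__":
    primary = Primary()
    primary.index("1", {"user": "gianluca"})
    replica = primary.recover_replica(during_copy=[("2", {"user": "david"})])
    primary.index("3", {"user": "nitesh"})
    print(replica.docs == primary.docs)
```

The point of the sketch is only the ordering: bulk copy first, then replay of whatever was logged in the meantime, then per-operation replay from that moment on.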

About terminology: we don't talk about "replica nodes" but about "replica shards". An index is split into shards. Shards are allocated to nodes. A shard can be a primary or a replica, so a given node can hold primary shards or replica shards. It does not really matter.

Makes sense?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr



(Nitesh Earkara) #3

Hi,

I have a basic understanding that shards are distributed across nodes
and that there will be replicas for each shard (assuming replicas are set up for
each shard), and also that a shard and its replica will never exist on the same node.

I have a few questions to clarify my understanding of shards and replicas:

  1. How does ES divide the data across shards? How does it decide which data, and
    how much data, should go to each shard?
  2. Does one shard know what data is present in another shard?
  3. If new data is added to an index/type, how does ES decide which
    shard the data should go to? Does this happen instantaneously, or does some indexing
    or crawling occur at regular intervals, after which the data is added to the
    shard?
  4. How is data synchronized between a replica and its primary shard? If data is added to
    a shard, how long will it take to appear in the replica?



(David Pilato) #4

Answered inline.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 13 March 2014 at 06:50, Nitesh Earkara enitesh@gmail.com wrote:

  1. How does ES divide the data across shards? How does it decide which data, and how much data, should go to each shard?
    Using a routing value (the default is _id), which is hashed; we then compute a modulo based on the number of shards.
  2. Does one shard know what data is present in another shard?
    No.
  3. If new data is added to an index/type, how does ES decide which shard the data should go to? Does this happen instantaneously, or does some indexing or crawling occur at regular intervals, after which the data is added to the shard?
    I explained this in answer 1.
    It happens at index time.
  4. How is data synchronized between a replica and its primary shard? If data is added to a shard, how long will it take to appear in the replica?
    Immediately, by default. When you get the response back, you know that your doc is on every shard it should be on.
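
The routing rule from answer 1 can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: `zlib.crc32` is only a stand-in for whatever hash function Elasticsearch uses internally, and the names `pick_shard` and `distribute` are hypothetical.

```python
import zlib


def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document _id) to a shard number."""
    # Stand-in hash: Elasticsearch uses its own hash function, not crc32.
    hashed = zlib.crc32(routing.encode("utf-8"))
    # The modulo of the hash by the number of shards picks the shard.
    return hashed % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Group document ids by the shard they would be routed to."""
    shards = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    layout = distribute([f"doc-{i}" for i in range(10)], 5)
    for shard, ids in layout.items():
        print(shard, ids)
```

Because the shard is computed from the routing value alone, the decision is made at index time for each document, and no shard needs to know what the others contain. The same routing value always lands on the same shard as long as the number of shards does not change.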



(Nitesh Earkara) #5

Thanks David.

Is there any command to check what data is present in each shard?



(David Pilato) #6

You can search using _routing, giving a document ID as the routing key. The search then runs only on the shard that this ID routes to.
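
As a rough sketch of what that looks like over HTTP, the snippet below builds a search URL with the `routing` query parameter. The base URL, the index name `myindex`, and the helper names are assumptions for illustration only; the second function needs a running cluster and is not called here.

```python
import json
import urllib.request
from urllib.parse import urlencode


def routed_search_url(base_url: str, index: str, routing: str) -> str:
    """Build a _search URL restricted to the shard the routing value maps to."""
    query = urlencode({"routing": routing})
    return f"{base_url}/{index}/_search?{query}"


def search_by_routing(base_url: str, index: str, routing: str) -> dict:
    """Run the routed search; requires a reachable Elasticsearch node."""
    url = routed_search_url(base_url, index, routing)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical local node and index name.
    print(routed_search_url("http://localhost:9200", "myindex", "doc-1"))
```

The hits you get back are then the documents living on that one shard, and the `_shards` section of the response should tell you how many shards were actually queried.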

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr



(system) #7