Read/Write consistency

I'm trying to understand the following consistency scenarios in
Elasticsearch:

  1. sync replication - How does Elasticsearch deal with the consistency
    issues that may arise when one node momentarily goes down and misses
    writes? When the node comes back up, could reads going to the
    non-primary shards get inconsistent data?
  2. async replication - What happens if replication is slow for some reason?
    Could users see inconsistent data?
  3. sync/async replication - How does Elasticsearch keep data in sync for
    those writes that never happened on the non-primary shard because of
    network/node failures?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAOT3TWrpUEpVU2yg3km_v%3DtuA0duSiFV5HYnPyeCztdmrTcMsA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Could somebody help me get some insight into this topic?


Hi Mohit,

I'll answer inline.

On Mon, Apr 28, 2014 at 4:57 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

Trying to understand the following scenarios of consistency in
elasticsearch:

  1. sync replication - How does Elasticsearch deal with the consistency issues
    that may arise when one node momentarily goes down and misses writes?

This depends on the write consistency setting. By default, the operation
only succeeds if a quorum of replicas can index the document.
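To make the quorum rule concrete, here is a minimal sketch of how the required number of active shard copies could be computed. It assumes the formula given in the Elasticsearch 1.x documentation, int((primary + number_of_replicas) / 2) + 1, and the documented exception that the quorum is only enforced when number_of_replicas is greater than 1. The function name write_quorum is hypothetical and is not part of any Elasticsearch API.

```python
def write_quorum(number_of_replicas: int) -> int:
    """Shard copies that must be active for a write under consistency=quorum.

    Assumption: follows the formula from the Elasticsearch 1.x docs,
    int((primary + number_of_replicas) / 2) + 1, with one primary.
    """
    if number_of_replicas <= 1:
        # Assumption: with 0 or 1 replica the quorum is not enforced,
        # so the primary alone is enough (otherwise a single-node
        # cluster with default settings could never index anything).
        return 1
    total_copies = 1 + number_of_replicas  # one primary plus its replicas
    return total_copies // 2 + 1


for replicas in range(5):
    print(f"number_of_replicas={replicas} -> quorum of {write_quorum(replicas)}")
```

In the 1.x line the level can reportedly be chosen per request with the consistency parameter (one, quorum or all) or cluster-wide with action.write_consistency; check the documentation for the version you run.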

When the node comes back up, could reads going to the non-primary shards
get inconsistent data?

No, when the node comes back up it will sync the stuff it missed with the
other nodes.

  2. async replication - What happens if replication is slow for some
    reason? Could users see inconsistent data?

Yes, if a request hits a shard copy that didn't get the latest operation, it
could see an "old" version of the data. You can use "preference" to try and
hit the primary shard all the time, but then your replicas will just be
sitting there for redundancy.
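As an illustration of the "preference" option, the sketch below only builds the search URLs; it does not contact a cluster. The host http://localhost:9200 and the index name myindex are placeholders, and search_url is a hypothetical helper. The value _primary was accepted by the Elasticsearch versions current when this thread was written; it was removed in later major versions, so treat it as version-dependent.

```python
from urllib.parse import urlencode


def search_url(base_url: str, index: str, preference: str = "") -> str:
    """Build a _search URL, optionally pinning the request with `preference`."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    if preference:
        # e.g. preference=_primary routes the search to primary shards only
        url += "?" + urlencode({"preference": preference})
    return url


# Placeholder host and index name, for illustration only.
primary_only = search_url("http://localhost:9200", "myindex", "_primary")
any_copy = search_url("http://localhost:9200", "myindex")

print(primary_only)
print(any_copy)
```

The trade-off is the one described above: pinning reads to the primary avoids stale reads from a lagging replica, but the replicas then no longer share the search load.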

  3. sync/async replication - How does Elasticsearch keep data in sync for
    those writes that never happened on the non-primary shard because of
    network/node failures?

It either uses the transaction log or it transfers the whole shard to that
node.
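The two recovery paths can be pictured with a toy model. This is only a sketch of the general idea (replay a log when it still covers the gap, otherwise copy everything); it is not Elasticsearch's recovery code, and the real mechanism is more involved. All class names, the operation counter and the flush behaviour here are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ShardCopy:
    docs: dict = field(default_factory=dict)  # doc id -> source
    applied: int = 0                          # number of the last applied operation


@dataclass
class Primary(ShardCopy):
    translog: list = field(default_factory=list)  # (op number, doc id, source)
    translog_start: int = 0  # operations up to this number were trimmed by a flush

    def index(self, doc_id: str, source: str) -> None:
        self.applied += 1
        self.docs[doc_id] = source
        self.translog.append((self.applied, doc_id, source))

    def flush(self) -> None:
        # Hypothetical flush: the log is emptied once data is safely stored.
        self.translog.clear()
        self.translog_start = self.applied


def recover(primary: Primary, replica: ShardCopy) -> str:
    """Bring the replica up to date and report which path was taken."""
    if replica.applied >= primary.translog_start:
        # The log still holds everything the replica missed: replay it.
        for op, doc_id, source in primary.translog:
            if op > replica.applied:
                replica.docs[doc_id] = source
                replica.applied = op
        return "translog replay"
    # The log no longer covers the gap: copy the whole shard.
    replica.docs = dict(primary.docs)
    replica.applied = primary.applied
    return "full copy"


primary, replica = Primary(), ShardCopy()
primary.index("1", "a")
replica.docs["1"], replica.applied = "a", 1  # replica saw the first write
primary.index("2", "b")                      # replica's node is down here
primary.index("1", "c")
first = recover(primary, replica)

primary.flush()
primary.index("3", "d")
fresh = ShardCopy()                          # a copy that has nothing yet
second = recover(primary, fresh)
print(first, second)
```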

Best regards,
Radu

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


What's not clear is how Elasticsearch identifies which pieces of data are
missing between the primary and the replica.

On Wed, Apr 30, 2014 at 3:27 AM, Radu Gheorghe <radu.gheorghe@sematext.com> wrote:

Hi Mohit,

I'll answer inline.


Hi Mohit,

I think the transaction log takes care of that, because there's a copy on
all instances of the same shard, and they need to be in sync.

Best regards,
Radu

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Thu, May 1, 2014 at 9:57 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

What's not clear is how Elasticsearch identifies which pieces of data are
missing between the primary and the replica.


Is there any documentation on that? From what I've read, the transaction log
is local to the node.

On Thu, May 1, 2014 at 11:57 PM, Radu Gheorghe <radu.gheorghe@sematext.com> wrote:

Hi Mohit,

I think the transaction log takes care of that, because there's a copy on
all instances of the same shard, and they need to be in sync.

Best regards,
Radu
