I have been searching the documentation, Stack Overflow and this Google group for answers about the GET API and consistency, but there are still some details I am not sure I understand correctly.
Here are the things I think I understood; don’t hesitate to correct me if I am wrong:
- a GET immediately following a PUT (with sync enabled) will always return the same document, thanks to the transaction log. This is true even if the GET has the default "random" preference (assuming no other process writes at the same time).
- even with QUORUM consistency, a write operation in "sync" mode will always send the new doc to all replicas and wait for their answers. QUORUM only changes the number of successful replications needed for the request to succeed.
- if a replica goes down and comes back, it has to synchronise with the other nodes before it is allowed to answer requests.
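To make sure I read the QUORUM rule correctly, here is a toy sketch of the majority calculation as I understand it (the function name and the formula are my own reconstruction, not code from Elasticsearch):

```javascript
// My reading of QUORUM write consistency: the write is still sent to
// every copy, but the request succeeds as soon as a strict majority
// of all copies (primary + replicas) have acknowledged it.
function requiredAcks(numberOfReplicas) {
  var totalCopies = 1 + numberOfReplicas; // 1 primary + N replicas
  return Math.floor(totalCopies / 2) + 1;
}

// With 1 primary and 2 replicas, 2 acknowledgements out of 3 suffice.
console.log(requiredAcks(2)); // prints 2
```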
What I don’t understand is how all of this works in the case of a short node failure.
Let’s take a simplified example:
- 3 nodes: A, B and C
- 1 shard, with node A as the primary and B and C as replicas
- 1 single-threaded client

1. The client PUTs a doc in sync mode with QUORUM consistency.
2. The request is redirected to node A, where the doc is written.
3. The doc is replicated to node B.
4. Node C does not respond and fails to replicate (due, for example, to garbage collection).
5. As the quorum is satisfied, node A returns a success.
6. The garbage collector finishes its job on node C; it can be contacted again.
7. Once the answer from node A is received, the client performs a GET of the document with the default (random) preference.
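The write part of the steps above can be replayed as a toy simulation (purely illustrative: the node objects and the QUORUM constant are my own invention, not the actual replication code):

```javascript
// Toy replay of steps 1-5: the primary fans the write out to every
// copy; node C fails to acknowledge, but the quorum (2 of 3) is met,
// so the client still gets a success back.
var QUORUM = 2;

function syncWrite(copies) {
  var acks = copies.filter(function (copy) { return copy.ack; }).length;
  return acks >= QUORUM ? "success" : "failure";
}

var nodes = [
  { name: "A", ack: true },  // primary, writes the doc (step 2)
  { name: "B", ack: true },  // replica, replication succeeds (step 3)
  { name: "C", ack: false }  // replica, stuck in GC (step 4)
];

console.log(syncWrite(nodes)); // prints "success" (step 5)
```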
Here are the questions:
- What happens between steps 4 and 5? Is node C marked as unallocated immediately, before it has answered the write request?
- What happens between steps 6 and 7? The problem was very short and node C never stopped. Is it possible that node C does not realise it failed some requests and continues to answer client requests?
- Do the official client libraries detect that a node has been unallocated before sending a request?
- What happens if a client does not check for unallocated nodes in step 7 and sends the GET request directly to node C?
- What happens if, in step 7, the client sends the GET request to node B (not the primary)? Does it know that C has been unallocated? If not, can the request be redirected to node C (as the preference is random)?
- What happens if, in step 7, the client sends the GET request to node A (the primary shard)? (Just to be sure.)
I have been using Elasticsearch for a few months now and you guys have done
a really great job. Thank you for your hard work. I have not experienced
the problems I described here, those are just scary things I imagined after
reading the doc. Maybe these corner cases have already been explained. If
so, I apologise.
The reason I am concerned is this sentence, found in the description of an ongoing issue at http://www.elasticsearch.org/guide/en/elasticsearch/resiliency/current/ :
"If a network partition separates a node from the master, there is some
window of time before the node detects it. This window is extremely small
if a socket is broken. More adversarial partitions, for example, silently
dropping requests without breaking the socket can take longer (up to 3x30s
using current defaults)"
The client I use is elasticsearch.js. Its documentation says that it round-robins requests across its connections, and I would like to know whether it can end up sending the GET request to an unallocated node.
If this can happen, what are the recommended ways to prevent it? Sending requests only to non-data nodes?
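For reference, this is roughly how my client is configured (the host names are made up; with several hosts listed, the elasticsearch.js documentation says requests are round-robined across them, which is why I worry a GET could land on node C):

```javascript
// Illustrative elasticsearch.js setup: with three hosts configured,
// the client round-robins requests across them, so any of the nodes
// in my example could receive the GET.
var elasticsearch = require('elasticsearch');

var client = new elasticsearch.Client({
  hosts: ['nodeA:9200', 'nodeB:9200', 'nodeC:9200']
});
```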