GET Consistency (and Quorum) in ElasticSearch

I am new to ElasticSearch and I am evaluating it for a project.

In ES, replication can be sync or async. In the async case, the client is
returned success as soon as the document is written to the primary shard.
The document is then pushed to the replicas asynchronously.

When writing asynchronously, how do we ensure that a GET returns the
document even if it has not yet propagated to all the replicas (for example,
when we do a GET immediately after PUTting the document)? I ask because
when we do a GET in ES, the query is forwarded to one of the replicas of
the appropriate shard. Provided we are writing asynchronously, the primary
shard may have the document but the selected replica for doing the GET may
not have received/written the document yet. In Cassandra, we can specify
consistency levels (ONE, QUORUM, ALL) at the time of writes as well as
reads. Is something like that possible for reads in ES?

--

As far as I understand the big picture, when you index a document it is written to the transaction log, and then you get a successful answer from ES.
Afterwards, asynchronously, it is replicated to the other nodes and indexed by Lucene.

That said, you cannot search for the document immediately, but you can GET it.
ES will read the tlog if needed when you GET a document.

I think (not sure) that if the replica is not up to date, the GET will be sent to the primary's tlog.

Correct me if I'm wrong.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 29 Dec 2012, at 11:40, Vaidik Kapoor kapoor.vaidik@gmail.com wrote:

> I am new to ElasticSearch and I am evaluating it for a project.
>
> In ES, replication can be sync or async. In the async case, the client is returned success as soon as the document is written to the primary shard. The document is then pushed to the replicas asynchronously.
>
> When writing asynchronously, how do we ensure that a GET returns the document even if it has not yet propagated to all the replicas (for example, when we do a GET immediately after PUTting the document)? I ask because when we do a GET in ES, the query is forwarded to one of the replicas of the appropriate shard. Provided we are writing asynchronously, the primary shard may have the document but the selected replica for doing the GET may not have received/written the document yet. In Cassandra, we can specify consistency levels (ONE, QUORUM, ALL) at the time of writes as well as reads. Is something like that possible for reads in ES?

--

> Provided we are writing asynchronously, the primary shard may have the
> document but the selected replica for doing the GET may not have
> received/written the document yet.

You can specify the "preference" in the Get API, see
http://www.elasticsearch.org/guide/reference/api/get.html
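
As a rough illustration of the "preference" parameter, the sketch below only builds the URL for a GET-by-id request and does not contact a cluster. The host, index, type, and id are made-up values, the helper name build_get_url is hypothetical, and "_primary" is assumed to be the preference value that routes the GET to the primary shard in the Elasticsearch version discussed here.

```python
from urllib.parse import quote, urlencode


def build_get_url(base_url, index, doc_type, doc_id, preference=None):
    """Return the URL of a GET-by-id request, optionally with a shard preference."""
    # Percent-encode each path segment so unusual ids cannot break the path.
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    url = f"{base_url.rstrip('/')}/{path}"
    if preference is not None:
        # Assumption: "_primary" asks for the primary copy of the shard.
        url += "?" + urlencode({"preference": preference})
    return url


print(build_get_url("http://localhost:9200", "twitter", "tweet", "1",
                    preference="_primary"))
# http://localhost:9200/twitter/tweet/1?preference=_primary
```

Reading from the primary right after an async write avoids hitting a replica that has not received the document yet, at the cost of concentrating read load on the primaries.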

> In Cassandra, we can specify consistency levels (ONE, QUORUM, ALL) at the
> time of writes as well as reads. Is something like that possible for reads
> in ES?

You can specify the "write consistency" in a similar way with
Elasticsearch, see
http://www.elasticsearch.org/guide/reference/api/index_.html
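
For comparison with Cassandra's per-request levels, here is a minimal sketch of how a per-request write consistency might be expressed. It only builds the URL of an index request. The query parameter names "consistency" and "replication" are assumed to be the per-request counterparts of the settings discussed in this thread for this Elasticsearch version; the helper name and the host, index, type, and id values are hypothetical.

```python
from urllib.parse import urlencode

VALID_CONSISTENCY = {"one", "quorum", "all"}
VALID_REPLICATION = {"sync", "async"}


def build_index_url(base_url, index, doc_type, doc_id,
                    consistency="quorum", replication="sync"):
    """Return the URL of an index (PUT) request with explicit write options."""
    if consistency not in VALID_CONSISTENCY:
        raise ValueError(f"unknown consistency level: {consistency!r}")
    if replication not in VALID_REPLICATION:
        raise ValueError(f"unknown replication type: {replication!r}")
    # Assumed parameter names; check the index API docs for your version.
    query = urlencode({"consistency": consistency, "replication": replication})
    return f"{base_url.rstrip('/')}/{index}/{doc_type}/{doc_id}?{query}"


print(build_index_url("http://localhost:9200", "twitter", "tweet", "1",
                      consistency="one", replication="async"))
# http://localhost:9200/twitter/tweet/1?consistency=one&replication=async
```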

Karel

--

There are different things:

  • replication type
  • write consistency
  • read consistency

By default, Elasticsearch indexing uses replication type "sync". The
parameter is "action.replication_type". This ensures that the replication
actions are executed on all participating nodes (replicated shards)
before the indexing operation returns. An alternative is "async", which
starts the replication actions in separate threads and does not wait for
answers from the nodes. See https://github.com/elasticsearch/elasticsearch/issues/196

There is also a write consistency level, which controls when a write is
allowed to succeed in a distributed system. The parameter is
"action.write_consistency". By default, it is set to "quorum". With
"quorum", a write can succeed even if not all copies of the shard are
available, as long as a majority of them are. If such a quorum is not
fulfilled, indexing returns with an error after a timeout. Other values
are "one" and "all".
See https://github.com/elasticsearch/elasticsearch/issues/444
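
To make the three levels concrete, here is a small sketch of the arithmetic. The function name is hypothetical, and the quorum rule used (a strict majority of primary plus replicas, relaxed to a single copy when there are only one or two copies in total) is an assumption about how Elasticsearch counts, not something stated in this thread.

```python
def required_active_copies(level, number_of_replicas):
    """Return how many shard copies must be active for a write to proceed."""
    total_copies = number_of_replicas + 1  # the primary plus its replicas
    if level == "one":
        return 1
    if level == "all":
        return total_copies
    if level == "quorum":
        # Assumed rule: strict majority, but only enforced with 3+ copies.
        if total_copies > 2:
            return total_copies // 2 + 1
        return 1
    raise ValueError(f"unknown consistency level: {level!r}")


for replicas in (0, 1, 2, 4):
    print(replicas, required_active_copies("quorum", replicas))
```

Under these assumptions, an index with 2 replicas (3 copies) needs 2 active copies for a "quorum" write, and one with 4 replicas (5 copies) needs 3.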

Write consistency across node failures is ensured with a "transaction
log", or translog, in write-ahead style: on each node, all write
operations are recorded in a separate file before they are executed at
the shard level.

Note that read consistency is different. To achieve read consistency, each
Lucene index reader would have to reopen the index to get the most current
index state, which is an expensive operation. Lucene offers "near real
time" search (NRT) to improve the situation. This works by using the
IndexWriter buffer as an additional segment in search operations.
Elasticsearch makes use of it by default. The parameter
"action.get.realtime" is set to true by default. This means you don't have
to refresh the index in order to use a "get" when you want to read what
you have just written.

To let all other readers read what you have written, Elasticsearch
refreshes the buffers of an index regularly, after which all readers on
that index obtain the current state. The parameter is
"index.refresh_interval". By default, the interval is "1s" (one second).
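
The difference between a realtime "get" and a search that waits for a refresh can be illustrated with a toy in-memory model. This is not how Elasticsearch is implemented; the class ToyShard and its methods are hypothetical names, chosen only to mirror the behaviour described in this thread.

```python
class ToyShard:
    """Toy model: recent writes live in a log until a refresh publishes them."""

    def __init__(self):
        self._translog = {}    # recent writes, not yet visible to searches
        self._searchable = {}  # state that searches see since the last refresh

    def index(self, doc_id, doc):
        self._translog[doc_id] = doc

    def get(self, doc_id, realtime=True):
        # A realtime get consults the recent writes first.
        if realtime and doc_id in self._translog:
            return self._translog[doc_id]
        return self._searchable.get(doc_id)

    def search(self, predicate):
        # Searches only see what the last refresh made visible.
        return [doc for doc in self._searchable.values() if predicate(doc)]

    def refresh(self):
        self._searchable.update(self._translog)
        self._translog.clear()


shard = ToyShard()
shard.index("1", {"user": "kimchy"})
print(shard.get("1"))               # visible to a realtime get immediately
print(shard.search(lambda d: True)) # empty until a refresh happens
shard.refresh()
print(shard.search(lambda d: True)) # now the document is searchable
```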

To guarantee read consistency, you must refresh the Elasticsearch index
with the parameter "refresh".
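
A minimal sketch of using "refresh" on an index request follows. It only constructs the HTTP request object and does not send it. The host, index, type, and id are made-up values, the helper name is hypothetical, and the query-string form "refresh=true" is assumed to be how the parameter is passed in this version.

```python
import json
from urllib.request import Request


def build_index_request(base_url, index, doc_type, doc_id, document,
                        refresh=False):
    """Build (but do not send) a PUT request that indexes one document."""
    url = f"{base_url.rstrip('/')}/{index}/{doc_type}/{doc_id}"
    if refresh:
        # Assumption: this asks for a refresh right after the write.
        url += "?refresh=true"
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


request = build_index_request("http://localhost:9200", "twitter", "tweet", "1",
                              {"user": "kimchy"}, refresh=True)
print(request.get_method(), request.full_url)
# PUT http://localhost:9200/twitter/tweet/1?refresh=true
```

Refreshing on every write makes the document searchable right away but gives up the batching that the periodic refresh provides, so it is usually reserved for writes that must be visible immediately.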

Another method would be to block all reads until the next refresh has
happened. Bad for performance, good for DB folks who love a
transactional style. An issue is open for this.

Jörg

--
