How to handle error and retry/recovery


(Alois Cochard) #1

Greetings,

I have some questions about error handling in ElasticSearch and couldn't
find the answers in the online documentation.

When I index a document, how can I be sure that the document was
indexed?

If a serious problem occurs (say, an out-of-disk-space error),
how will ES report the exception back to me?

I am wondering how asynchronous ES is. I mean:

  • Index a new document
  • Receive response OK (HTTP 200)
  • Can something still break at this point and prevent the cluster from
    indexing the document (a node running out of disk space, for example)?

In other words, how can I be sure that my document is really
indexed/updated/deleted? Is checking for an HTTP 200 response enough?

And finally, perhaps RabbitMQ could help me: I could send all failed
commands to a queue (not yet implemented). Is that more reliable than
using the REST API?

Thanks a lot for your help!

Alois


(Shay Banon) #2

Hi,

The index API covers indexing the document, writing it to a transaction log, and replicating it to the replicas. It returns only after all of these have succeeded. If there was a problem, a non-200 response code is returned. So if you get a 200, it means the document is there.
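
To make the "200 means it's there, anything else means a problem" rule concrete, here is a minimal client-side sketch in Java. It is not part of any ES client: the class name IndexRetry, the sendIndexRequest callback (assumed to perform the HTTP call and return the status code), and the linear backoff are all illustrative assumptions. It treats any 2xx status as success, which is an assumption on my side as well.

```java
import java.util.function.IntSupplier;

final class IndexRetry {

    /** Assumption: any 2xx status means the document was accepted. */
    public static boolean isSuccess(int status) {
        return status >= 200 && status < 300;
    }

    /**
     * Calls sendIndexRequest (hypothetical: performs the HTTP call and returns
     * the status code) up to maxAttempts times. Returns true as soon as one
     * attempt succeeds, false if all attempts fail or the thread is interrupted.
     */
    public static boolean indexWithRetry(IntSupplier sendIndexRequest,
                                         int maxAttempts,
                                         long backoffMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isSuccess(sendIndexRequest.getAsInt())) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    // Linear backoff: wait a little longer after each failure.
                    Thread.sleep(backoffMillis * attempt);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated request: fails twice with 503, then succeeds with 200.
        int[] calls = {0};
        boolean indexed = indexWithRetry(() -> ++calls[0] < 3 ? 503 : 200, 5, 10L);
        System.out.println("indexed=" + indexed + " after " + calls[0] + " attempts");
    }
}
```

If you retry like this, it is probably safer to index with an explicit document ID, so that a repeated request overwrites the same document instead of creating a second one.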

-shay.banon

(In reply to Alois Cochard's message of Monday, December 13, 2010 at 2:35 PM.)


(Alois Cochard) #3

Hello Shay,

OK, I must admit that at first I wasn't sure about your claim, because I did some prototyping earlier this year with version 0.9.

With that version, if I didn't do a 'flush' after an 'index' operation, I wasn't able to find the record with 'get' directly after indexing.

That's what made me think an operation isn't guaranteed just because the 'index' response is 200 (something bad could happen before the auto-flush and data could be lost).

But I just ran some tests with 0.13.1, and there is no need to flush before getting the freshly indexed document. Seems you made some good improvements in your framework ;)

I have another question: if I do HEAVY indexing in parallel, at some point ES is going to put my 'index' requests on hold to avoid overloading.

How will ES react? Will my connections time out, or will I get a specific HTTP response telling me the node is busy and can't handle any more connections?

I think the RabbitMQ River could help in this kind of situation by keeping operations in a queue and so preventing ES overload. Is my reasoning correct? What are your thoughts on that? I mean, is it worth using RabbitMQ if it's ONLY to prevent that kind of situation?

One last question, out of curiosity: how does ES interact with RabbitMQ to avoid overloading? Can you configure how many operations can be processed in parallel (per node or for the whole cluster)?

Thanks a lot for your support,

Alois Cochard
http://aloiscochard.blogspot.com
http://www.twitter.com/aloiscochard


(Shay Banon) #4

Hi,

First, an indexed document does not become visible to search requests immediately. The fact that you can't find it right after indexing does not mean it's not there. That's the near-real-time aspect of it.

For heavy indexing, I suggest using the bulk API. There is still room for improvement when it comes to ES pushing back under heavy load. Currently, it's mainly on you, but there is an option to configure ES with a blocking thread pool, which will block when the thread pool is maxed out.
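
Since pushing back is mainly on the client for now, one way to do it is to cap the number of requests in flight. The following Java sketch does that with a semaphore. The class IndexThrottle and its method names are made up for illustration, and the request passed to submit stands in for whatever index or bulk call you actually make.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class IndexThrottle {

    private final Semaphore permits;

    /** maxInFlight is the number of requests allowed to run at the same time. */
    public IndexThrottle(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /**
     * Blocks the calling thread until a slot is free, runs the request,
     * and frees the slot again even if the request throws.
     */
    public <T> T submit(Supplier<T> request) {
        permits.acquireUninterruptibly();
        try {
            return request.get();
        } finally {
            permits.release();
        }
    }

    /** Number of slots that are currently free. */
    public int availableSlots() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        IndexThrottle throttle = new IndexThrottle(4);
        // Placeholder for a real index/bulk call that returns an HTTP status.
        int status = throttle.submit(() -> 200);
        System.out.println("status=" + status + ", free slots=" + throttle.availableSlots());
    }
}
```

Worker threads that share one IndexThrottle simply block in submit once the limit is reached, which is roughly the same effect a blocking thread pool would have on the server side.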

(In reply to Alois Cochard's message of Tuesday, December 14, 2010 at 11:45 AM.)

--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/How-to-handle-error-and-retry-recovery-tp2078382p2084571.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


(Alois Cochard) #5

Just in case someone comes across this thread while searching for the 'near real-time' aspect of the engine...

I found a cool way in the docs to handle this manually. For example, in my unit tests I needed to add a sleep between the index and search operations... but I HATE having sleeps in my unit tests!

Simply set 'index.refreshInterval' to '-1' on the node to disable auto-refresh, and refresh manually.

From the Java API:
node.client().admin().indices().prepareRefresh("twitter").execute().actionGet();

Ref: http://www.elasticsearch.com/docs/elasticsearch/index_modules/engine/robin/


(Shay Banon) #6

You don't have to set the refresh interval to -1; you can still call refresh explicitly yourself. Almost all operations now have a refresh flag that can be set to perform a refresh after the operation (though it should really only be used in testing). And, in master, there is the new versioning support.
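
For anyone going through the REST API instead of the Java client, the two options above correspond to a refresh=true query parameter on the operation and to the _refresh endpoint of the index. The Java sketch below only builds the URIs. The helper class RefreshUrls, the base URL http://localhost:9200, the type "tweet" and the ID "1" are assumptions for illustration, and the index/type/id path layout matches older releases.

```java
import java.net.URI;

final class RefreshUrls {

    /** URI for indexing one document and refreshing right after (?refresh=true). */
    public static URI indexWithRefresh(String baseUrl, String index, String type, String id) {
        return URI.create(baseUrl + "/" + index + "/" + type + "/" + id + "?refresh=true");
    }

    /** URI of the explicit refresh endpoint for one index (send a POST to it). */
    public static URI refreshIndex(String baseUrl, String index) {
        return URI.create(baseUrl + "/" + index + "/_refresh");
    }

    public static void main(String[] args) {
        String base = "http://localhost:9200"; // assumed local node
        System.out.println(indexWithRefresh(base, "twitter", "tweet", "1"));
        System.out.println(refreshIndex(base, "twitter"));
    }
}
```

In a test you would send the document to the first URI, or index normally and then POST to the second one, before running the search.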

(In reply to Alois Cochard's message of Wednesday, January 12, 2011 at 4:08 PM.)

