About RabbitMQ River


(Frederic) #1

I've read the http://www.elasticsearch.org/guide/reference/river/rabbitmq.html
page, but I still have some questions:

  • What would happen in case of errors when indexing a bulk of jsons?
    is that info put in the queue again (not ACK'ed)?

  • Although every River is a singleton, does it make sense to configure
    2 rivers pointing to the same queue in order to process it in parallel?


(Alberto Paro-2) #2

Sent from my iPhone

On 13/set/2011, at 21:16, Frederic focampo.br@gmail.com wrote:

I've read the http://www.elasticsearch.org/guide/reference/river/rabbitmq.html
page, but I still have some questions:

  • What would happen in case of errors when indexing a bulk of jsons?
    is that info put in the queue again (not ACK'ed)?

You lose all the items that have already been ack'ed.

  • Although every River is a singleton, does it make sense to configure
    2 rivers pointing to the same queue in order to process it in parallel?
    Yes, to increase parallelism you can activate several rivers on the same queue: RabbitMQ guarantees unique delivery of each message.
    Hint: tune the bulk_size.
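
To illustrate the point about running several rivers against one queue: RabbitMQ delivers each message on a queue to exactly one of its consumers (round-robin by default), so the consumers split the work rather than duplicate it. Here is a toy Python sketch of that dispatch behaviour; it is a simulation, not a real broker, and the `ToyQueue` class and both "river" lists are made up for illustration.

```python
from collections import deque

class ToyQueue:
    """Toy stand-in for a RabbitMQ queue: each message is
    delivered to exactly one registered consumer, round-robin."""
    def __init__(self):
        self.messages = deque()
        self.consumers = []

    def publish(self, msg):
        self.messages.append(msg)

    def register(self, consumer):
        self.consumers.append(consumer)

    def dispatch(self):
        i = 0
        while self.messages:
            # Each message goes to exactly one consumer, never both.
            self.consumers[i % len(self.consumers)].append(self.messages.popleft())
            i += 1

queue = ToyQueue()
river_a, river_b = [], []          # two "rivers" consuming the same queue
queue.register(river_a)
queue.register(river_b)
for n in range(10):
    queue.publish({"index": {"_id": n}})
queue.dispatch()

# The work is split: every message went to exactly one river.
assert len(river_a) + len(river_b) == 10
assert not {m["index"]["_id"] for m in river_a} & {m["index"]["_id"] for m in river_b}
```

This is also why tuning bulk_size matters: each river batches whatever it receives, so with two rivers each bulk holds roughly half the throughput.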

(Frederic) #3

Thanks Alberto for your answer. When you say:

You lose all the items that have already been ack'ed.

You mean that if the river takes 100 items to index and, after having
indexed 50, the server fails, the other 50 will be lost? I'm not sure
how the bulk index process works, but if it is atomic, I guess that,
in case of failure, none of the 100 items will be ack'ed and they will
remain in the queue until the next try, right?

Thanks,



(Shay Banon) #4

Bulk indexing is not atomic: the documents that were indexed stay indexed, and
the failures are logged. When I wrote it, I added a TODO to possibly write the
failed docs to an exception queue, but never got around to doing it :slight_smile:
https://github.com/elasticsearch/elasticsearch/blob/master/plugins/river/rabbitmq/src/main/java/org/elasticsearch/river/rabbitmq/RabbitmqRiver.java#L263
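
Since the bulk response reports a per-item status, the exception-queue idea from that TODO amounts to partitioning the original actions by those statuses and republishing the failed ones. A minimal Python sketch of that partitioning, assuming the Elasticsearch bulk response shape (an `items` list where each entry maps the action type to a result with a `status` code); the `split_bulk_response` helper and the sample data are hypothetical:

```python
def split_bulk_response(response, actions):
    """Partition the original bulk actions into succeeded and failed,
    using the per-item status codes of the bulk response."""
    ok, failed = [], []
    for action, item in zip(actions, response["items"]):
        # Each item is e.g. {"index": {"status": 201, ...}}
        (_, result), = item.items()
        if 200 <= result["status"] < 300:
            ok.append(action)
        else:
            failed.append(action)
    return ok, failed

actions = [{"_id": 1}, {"_id": 2}, {"_id": 3}]
response = {"items": [
    {"index": {"status": 201}},
    {"index": {"status": 400, "error": "MapperParsingException"}},
    {"index": {"status": 201}},
]}

ok, failed = split_bulk_response(response, actions)
exception_queue = list(failed)     # what the TODO would republish to RabbitMQ
assert [a["_id"] for a in ok] == [1, 3]
assert [a["_id"] for a in exception_queue] == [2]
```

With the failed actions isolated, a consumer on the exception queue could retry or inspect them instead of only finding them in the logs.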



(Alberto Paro-2) #5


On 14/set/2011, at 19:44, Frederic focampo.br@gmail.com wrote:

Thanks Alberto for your answer. When you say:

You lose all the items that have already been ack'ed.

You mean that if the river takes 100 items to index and, after having
indexed 50, the server fails, the other 50 will be lost? I'm not sure
how the bulk index process works, but if it is atomic, I guess that,
in case of failure, none of the 100 items will be ack'ed and they will
remain in the queue until the next try, right?

When the river receives a message, it acks it and only then performs the bulk. Because those messages are already acked, if the river dies during the bulk you may lose them.

The problem is while the bulk is being filled.
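
The ordering Alberto describes (ack on receipt, bulk afterwards) is what creates the loss window: an acked message is gone from the broker, so nothing redelivers it if the consumer crashes before the bulk is sent. A toy Python model of the two orderings; this is a simulation for illustration, not the plugin's actual code, and `run_river` and the message names are made up:

```python
def run_river(messages, ack_before_bulk, crash_mid_bulk):
    """Toy model of the river's consumer loop.
    Returns (indexed, redeliverable): what got into Elasticsearch,
    and what the broker could still redeliver after a crash."""
    broker = list(messages)            # messages the broker would redeliver
    bulk = []
    for msg in list(messages):
        if ack_before_bulk:
            broker.remove(msg)         # ack on receipt: broker forgets it now
        bulk.append(msg)
        if crash_mid_bulk and len(bulk) == 2:
            return [], broker          # river dies while filling the bulk
    indexed = list(bulk)               # bulk request succeeds
    if not ack_before_bulk:
        broker = [m for m in broker if m not in indexed]   # ack after bulk
    return indexed, broker

msgs = ["m1", "m2", "m3"]

# Ack-on-receipt (what the river does): a crash mid-bulk loses m1 and m2.
indexed, redeliverable = run_river(msgs, ack_before_bulk=True, crash_mid_bulk=True)
assert indexed == [] and redeliverable == ["m3"]

# Ack-after-bulk: nothing was acked yet, so the broker can redeliver everything.
indexed, redeliverable = run_river(msgs, ack_before_bulk=False, crash_mid_bulk=True)
assert indexed == [] and redeliverable == ["m1", "m2", "m3"]
```

Acking only after the bulk succeeds trades the loss window for possible duplicate indexing on redelivery, which is often the safer side of the trade-off when documents have stable IDs.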



(system) #6