We are importing millions of documents a day via RabbitMQ, through the
RabbitMQ river, into Elasticsearch 0.19.2. It works great!
The index-part of the river is configured like this:
The messages we send to RabbitMQ for indexing are themselves bulk requests
with up to 100 inserts in one message; the size varies from 1 to 100.
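
To illustrate what is meant by a bulk message, here is a minimal sketch of how
such a message body could be built with a fixed number of inserts per message.
It assumes the usual Elasticsearch bulk format (one action line followed by
one source line, each ending in a newline). The index name "myindex", the type
"doc" and the helper names are hypothetical and not taken from our setup.

```python
import json


def build_bulk_message(docs, index="myindex", doc_type="doc"):
    """Build one bulk-format message body from (id, source) pairs.

    The index and type names are placeholders (assumptions).
    """
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where to index the document.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # Every line in the bulk format, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


def chunk(docs, size):
    """Yield consecutive slices of docs with at most `size` items each."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


if __name__ == "__main__":
    docs = [(str(i), {"n": i}) for i in range(5)]
    # size=1 would give one insert per message; size=100 matches our upper limit.
    for body in chunk(docs, 2):
        print(build_bulk_message(body), end="")
```

With size set to 1 every RabbitMQ message carries exactly one insert; with a
larger value the messages carry a constant batch, except for the last one.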
The question is:
Would it be better to put only one insert in each RabbitMQ message, so that
Elasticsearch always has the same amount of data to index, or does this make
no difference?
I am asking because on the search side the time the same search takes varies
widely, from milliseconds up to 20 seconds, and I thought this might be
related to the varying bulk size that Elasticsearch has to index, which
sometimes takes longer.
The refresh interval for Elasticsearch is set to 10s (changed from the default
1s because we saw much lower CPU load with a higher value).
Thank you for your help,