Bulk operations

NevB · February 10, 2010, 9:30pm

Hi,

Does Elasticsearch support bulk PUT requests?

By that I mean something like:

curl -XPUT http://localhost:9200/ -d \
'[{ "_index": "twitter", "_type": "user", "_id": "1", "_source": { "name": "Fred" }},
  { "_index": "twitter", "_type": "user", "_id": "2", "_source": { "name": "John" }}]'

This type of PUT is needed by AJAX apps that build a list of
entries and submit them as a single 'transaction', for example an
order document that is composed of multiple types.

Thanks for your thoughts,

Neville

kimchy · February 10, 2010, 11:04pm

Elasticsearch does not currently support bulk operations. There are several
reasons to support bulk operations; let's analyze them:

Bulk operations might imply transactionality (or atomicity at the very
least), meaning that either all operations succeed or all fail. I don't see
this feature getting into Elasticsearch in the near future because of the
complexity of implementing it in a distributed system. If the operations go
to different shards, this might mean a two-phase commit process, and even
if they all go to the same shard, it's not simple to implement on top of
Lucene while still maintaining high throughput.

Bulk operations can be used to speed up processing, for example by sending a
single message with 1000 operations instead of 1000 messages. This does make
sense for Elasticsearch, and it is the reason why bulking will be supported
in the near future (with no atomicity across operations, just a status for
each one saying whether it failed or not). But you should know that
Elasticsearch is highly optimized for concurrent usage and built on a fully
event-driven IO architecture. So, if you have 1000 operations, simply start
10 threads to do them, and you should get really good numbers (and I mean
really good numbers) for now. Moreover, if you have a good async HTTP client
library, then simply send all the indexing requests from a single thread and
register a listener for the results.
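
The thread-based suggestion above could look roughly like the sketch below. It is only an illustration, not an official client: the helper names (put_document, index_concurrently) are made up for this example, and the /index/type/id URL layout is an assumption taken from the fields in the question. Each document gets its own status, and one failure does not roll back the others.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def put_document(doc, base_url="http://localhost:9200"):
    """Send one document with an HTTP PUT and return the HTTP status code.

    Assumption: the /index/type/id URL layout is inferred from the fields
    used in the question; it is not confirmed by this thread.
    """
    url = f"{base_url}/{doc['_index']}/{doc['_type']}/{doc['_id']}"
    body = json.dumps(doc["_source"]).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_concurrently(docs, send=put_document, workers=10):
    """Call send(doc) for every document on a pool of worker threads.

    Returns one (doc id, ok, detail) tuple per document, in input order.
    There is no atomicity: each document succeeds or fails on its own.
    """
    def attempt(doc):
        try:
            return (doc["_id"], True, send(doc))
        except Exception as error:  # record the failure for this document only
            return (doc["_id"], False, str(error))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(attempt, docs))
```

Calling index_concurrently(docs) with a list of dicts shaped like the entries in the question sends one PUT per document across 10 threads. The send parameter is there so a different transport (or a stub for testing) can be swapped in.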

-shay.banon
