Bulk operations


(NevB) #1

Hi,

Does ElasticSearch support bulk PUT requests?

By that I mean something like:

curl -XPUT http://localhost:9200/ -d '[
  {"_index": "twitter", "_type": "user", "_id": "1", "_source": {"name": "Fred"}},
  {"_index": "twitter", "_type": "user", "_id": "2", "_source": {"name": "John"}}
]'

This type of PUT is needed by AJAX apps that build a list of entries
and submit them as a single 'transaction', for example an order
document that is made up of multiple types.

Thanks for your thoughts,

Neville


(Shay Banon) #2

ElasticSearch does not currently support bulk operations. There are several
reasons to support bulk operations; let's analyze them:

  1. Bulk operations might imply transactionality (or atomicity at the very
    least), meaning that either all operations succeed or all fail. I don't see
    this feature getting into ElasticSearch in the near future because of the
    complexity of implementing it in a distributed system. If the operations go
    to different shards, this might mean a two-phase commit process, and even
    if they all go to the same shard, it's not simple to implement on top of
    Lucene while still maintaining high throughput.

  2. Bulk operations can be used to speed up processing. For example, send a
    single message with 1000 operations instead of 1000 messages. This does make
    sense for ElasticSearch, and it is the reason why bulking will be supported
    in the near future (with no atomicity across the operations, though, just a
    status for each one saying whether it failed or not). But you should know
    that ElasticSearch is highly optimized for concurrent usage and built on a
    completely event-driven IO architecture. So, for now, if you have 1000
    operations, simply fork 10 threads to do them, and you should get really
    good numbers (and I mean, really good numbers :) ). Moreover, if you have a
    good async HTTP client lib, you can simply send all the indexing requests
    from a single thread and register a listener for the results.
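
To make the "fork 10 threads" suggestion concrete, here is a minimal Python sketch. It is an illustration only, not anything ElasticSearch ships: the helper names put_document and index_concurrently are made up for this example, and the PUT /{index}/{type}/{id} URL layout and the localhost address are assumptions taken from the question above. Each document gets its own status, and nothing is atomic across documents.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def put_document(base_url, doc):
    """Index one document with a single HTTP PUT (hypothetical helper).

    Assumes the URL layout {base_url}/{_index}/{_type}/{_id}.
    """
    url = f"{base_url}/{doc['_index']}/{doc['_type']}/{doc['_id']}"
    body = json.dumps(doc["_source"]).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="PUT")
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return response.status


def index_concurrently(docs, send, workers=10):
    """Call `send(doc)` for every document on a pool of worker threads.

    Returns one (doc id, ok, detail) tuple per document, in input order.
    A failed document does not undo or stop the others.
    """
    def attempt(doc):
        try:
            return (doc["_id"], True, send(doc))
        except Exception as error:  # record the failure, keep going
            return (doc["_id"], False, str(error))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(attempt, docs))
```

Against a local node it could be called as index_concurrently(docs, lambda d: put_document("http://localhost:9200", d)). Passing `send` in as a parameter keeps the threading logic separate from the HTTP call, so the same function works with any client library.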

-shay.banon

In reply to NevB's post of Wed, Feb 10, 2010 at 11:30 PM.

