Couchdb river index performance slows down after a few hours


I asked about this a long time ago and got no traction, so I am trying
again with new info. We are trying to import ~400m docs from CouchDB
into ES using the couchdb-river. Indexing starts out very fast (2,000 docs/s
according to bigdesk), but after a few hours (4-12 hr) it falls to
50 docs/s. Restarting ES brings it straight back to the fast rate. I am
using the following settings for the couchdb river:

    "bulk_size": "2500",
    "bulk_timeout": "40ms"
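
For reference, in the couchdb-river these two values go in the "index" section of the river's _meta document. A minimal sketch of that shape in Python, assuming placeholder host, database, and index names (they are hypothetical, not taken from this setup):

```python
import json


def river_meta(db, index, bulk_size="2500", bulk_timeout="40ms"):
    """Build a couchdb-river _meta body.

    The host, port, db, and index values are illustrative placeholders;
    only bulk_size and bulk_timeout come from the settings quoted above.
    """
    return {
        "type": "couchdb",
        "couchdb": {
            "host": "localhost",  # placeholder CouchDB host
            "port": 5984,         # CouchDB's default port
            "db": db,
        },
        "index": {
            "index": index,
            "type": index,
            "bulk_size": bulk_size,        # docs per bulk request
            "bulk_timeout": bulk_timeout,  # max wait before flushing a partial bulk
        },
    }


meta = river_meta("my_db", "my_db")
print(json.dumps(meta, indent=2))
```

The resulting JSON is the kind of body that gets PUT to _river/<river name>/_meta when the river is registered.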

I have also tuned ES (I think) to index as fast as possible.
I have followed the info
already and set number_of_shards: 20, number_of_replicas:
0, and indices.memory.index_buffer_size: 65 to help the indexing.
The box is also pretty beefy: it's an m1.xlarge with a 2000 IOPS EBS volume
mounted to it, and I am using the ES cookbook, so the heap is already set up
correctly. Disk and CPU both appear to have capacity to spare.
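
To pin down when the drop from ~2,000 docs/s to ~50 docs/s actually starts, one option is to log the index's document count at regular intervals and turn the samples into rates. A minimal sketch, assuming the (timestamp, doc count) samples are collected separately (for example by polling the index's _count endpoint); the function names and the sample numbers are made up for illustration:

```python
from typing import List, Optional, Tuple


def indexing_rates(samples: List[Tuple[float, int]]) -> List[float]:
    """Return docs/s for each consecutive pair of (unix_seconds, doc_count)."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        rates.append((c1 - c0) / elapsed)
    return rates


def first_slow_interval(rates: List[float], threshold: float) -> Optional[int]:
    """Return the position in rates of the first rate below threshold, or None."""
    for i, rate in enumerate(rates):
        if rate < threshold:
            return i
    return None


# Illustrative samples only: one per minute, not real measurements.
samples = [(0, 0), (60, 120000), (120, 240000), (180, 243000)]
rates = indexing_rates(samples)
slow_at = first_slow_interval(rates, threshold=100)
print(rates, slow_at)
```

Lining up the first slow interval with GC, merge, and river log activity at that time would help narrow down which component is stalling.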

Any idea what I am missing? I am pretty sure others have used the
couchdb-river to import large numbers of docs at a decent speed.

