Hi,
I had a similar requirement. I don't know whether this solution will
work for you, but you could try an alternative approach.
Instead of indexing the documents directly, push them onto a message
queue (such as RabbitMQ).
Have consumers that keep reading from the queue and index each
document into Elasticsearch.
By decoupling document generation from document indexing this way,
you need not worry about the rate at which your documents are being
created: the queue absorbs the bursts and the consumers index at their
own pace.
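
To make the idea concrete, here is a minimal sketch of the producer/consumer split. It uses Python's standard-library queue as an in-process stand-in for RabbitMQ, so it only illustrates the shape of the design, not a real broker setup. The names run_pipeline and index_fn are hypothetical; index_fn is a placeholder for whatever call actually sends a document to Elasticsearch.

```python
import queue
import threading


def consumer(q, index_fn):
    """Keep reading documents from the queue until a None sentinel arrives."""
    while True:
        doc = q.get()
        if doc is None:
            break
        index_fn(doc)  # placeholder for the real indexing call


def run_pipeline(docs, index_fn, num_consumers=2):
    """Queue every document, then let the consumers drain the queue."""
    q = queue.Queue()
    workers = [
        threading.Thread(target=consumer, args=(q, index_fn))
        for _ in range(num_consumers)
    ]
    for w in workers:
        w.start()
    # The producer only enqueues; it never waits on indexing.
    for doc in docs:
        q.put(doc)
    # One sentinel per consumer tells each of them to stop.
    for _ in workers:
        q.put(None)
    for w in workers:
        w.join()
```

With a real broker the producer and the consumers would be separate processes, possibly on different machines, but the division of work would be the same.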
Also, since your documents seem to be small, they should not add much
overhead on the messaging system.
If you use a framework like Celery, most of this is handled
transparently for you. You don't need a deep understanding of AMQP and
similar technologies.
Assuming that you are doing this on a cloud setup, you may already
have access to a RabbitMQ setup.
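
A consumer does not have to index one document per request either. The sketch below assumes your Elasticsearch version provides the _bulk endpoint, which accepts a newline-delimited body with one action line followed by one source line per document. The helper names drain_batch and bulk_body, the index name, and the sample field values are all hypothetical; sending the body over HTTP is left out.

```python
import json
import queue


def drain_batch(q, max_docs):
    """Take up to max_docs documents off the queue without blocking."""
    batch = []
    while len(batch) < max_docs:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


def bulk_body(index_name, docs):
    """Build a newline-delimited body: an action line, then the document."""
    if not docs:
        return ""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk format expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Illustrative usage with made-up values for the three fields.
q = queue.Queue()
for i in range(5):
    q.put({"Name1": "a", "Name2": "b", "Percent": i})
batch = drain_batch(q, 3)
body = bulk_body("myindex", batch)  # "myindex" is a hypothetical index name
```

The resulting body would be sent in a single POST to the _bulk endpoint, so a batch of documents costs one round trip instead of one per document.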
Regards,
Mahendra
http://twitter.com/mahendra
On Wed, Aug 25, 2010 at 12:32 AM, elasticsearcher <elasticsearcher@gmail.com> wrote:
I've searched around on the docs, and I haven't found a solution, so I
thought I'd ask here.
In my program, I generate many short documents very quickly (say, 1000
every few seconds per thread, with many threads on many nodes), and then
insert them into Elasticsearch for indexing one by one until they're gone.
I believe this may be a bottleneck in my system.
Is there any way to index a large batch of documents at once (all of the
same type)?
I am currently using the REST API via Python, but if this feature exists in
a different API instead, it is conceivable that I could incorporate it into
my program.
My document type looks like:
{
Name1:
Name2:
Percent:
}
I'm imagining the slowdown is simply because I have to push thousands of
documents to the cloud, one-by-one, even though I have large chunks of them
generated at once, and the overhead of individual transfers/indexing is the
bottleneck.
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Indexing-multiple-things-at-once-Possible-tp1317722p1317722.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.