I have a question about the throughput of _bulk inserts in
elasticsearch, running version 0.90.3 on Ubuntu 13.04 64-bit. I've done a
great deal of searching and have been hard pressed to find anything that
appropriately covers our needs.
In our system we're looking to index data at very high volumes, using an
admittedly odd parent / child relationship mapping, available
at https://gist.github.com/fidyeates/18cec7116926516bc033. Under our
current traffic we're looking at indexing around 1,500 'details' documents
per second, which equates to, on average, 12,000 documents per second that
need to be indexed, at around 2-3 KB per document. That gives us a write
rate of roughly 30-40 MB / second into elasticsearch.
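As a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. It only multiplies the document rate by the average document size quoted above; the constant and function names are made up for illustration, and it assumes 1 MB = 1000 KB and ignores any per-request overhead.

```python
# Back-of-the-envelope write rate from the figures quoted in this post.
DOCS_PER_SECOND = 12_000   # documents per second that need to be indexed
DOC_SIZE_KB = (2, 3)       # average document size range, in KB


def write_rate_mb_per_second(docs_per_second, doc_size_kb):
    """Return the raw payload rate in MB/second (taking 1 MB = 1000 KB)."""
    return docs_per_second * doc_size_kb / 1000


low, high = (write_rate_mb_per_second(DOCS_PER_SECOND, kb) for kb in DOC_SIZE_KB)
print(f"{low:.0f}-{high:.0f} MB/second")  # prints "24-36 MB/second"
```

The raw document payload alone comes out at 24-36 MB / second, which is in the same ballpark as the figure above.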
Our elasticsearch deployment is currently 2 m1.large Amazon EC2 instances,
configured with default configurations, 10 shards and 0 replicas. We are
also rotating keys daily and flushing data that's older than two weeks.
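To make the rotation concrete, here is a minimal sketch of the retention logic, assuming that "rotating keys daily" means one index per day with the date in its name. The `events-` prefix, the date format and the function names are hypothetical, and the actual deletion call is left out.

```python
from datetime import date, datetime, timedelta

INDEX_PREFIX = "events-"   # hypothetical prefix for the daily indexes
DATE_FORMAT = "%Y.%m.%d"   # hypothetical date suffix format
RETENTION_DAYS = 14        # keep two weeks of data


def index_for(day):
    """Return the index name that documents for `day` are written to."""
    return INDEX_PREFIX + day.strftime(DATE_FORMAT)


def expired_indexes(existing, today):
    """Return the daily indexes in `existing` that are older than the retention window."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    expired = []
    for name in existing:
        if not name.startswith(INDEX_PREFIX):
            continue  # not one of the daily indexes
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], DATE_FORMAT).date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired
```

Calling `expired_indexes(names, date.today())` once a day would give the list of indexes that are due to be dropped.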
However, we seem to have hit the capacity of this elasticsearch
deployment: we're waiting a long time on all the _bulk calls (tested with
bulk sizes from 500 to 10,000 documents). We're currently feeding in from
3 parallel threads, with the calls evenly distributed across the ES nodes.
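For reference, the feeding side looks roughly like the sketch below. It is a simplified illustration rather than our actual code: the node URLs, the shape of the input documents (`id`, `parent`, `source`) and the function names are all assumptions, and the HTTP call and threading are left out. It splits documents into batches, builds the newline-delimited _bulk body (one action line followed by one source line per document, with `_parent` set for child documents), and hands batches to the nodes in round-robin order.

```python
import itertools
import json

# Hypothetical node addresses; substitute the real ones.
NODES = ["http://es-node-1:9200", "http://es-node-2:9200"]


def chunked(docs, size):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch


def bulk_body(batch, index, doc_type):
    """Build a _bulk payload: an action line, then a source line, per document."""
    lines = []
    for doc in batch:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        if "parent" in doc:
            # Child documents carry the id of their parent.
            action["index"]["_parent"] = doc["parent"]
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc["source"]))
    return "\n".join(lines) + "\n"  # the body has to end with a newline


def plan_requests(docs, index, doc_type, batch_size):
    """Pair each batch with a node in round-robin order; yields (url, body)."""
    nodes = itertools.cycle(NODES)
    for batch in chunked(docs, batch_size):
        yield next(nodes) + "/_bulk", bulk_body(batch, index, doc_type)
```

Each `(url, body)` pair is then sent as a POST by one of the worker threads, with `batch_size` being the value we varied between 500 and 10,000.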
1.) Are there any good performance tweaks we can make to the cluster to
increase indexing throughput? I can understand if we have hit
the capacity of two m1.larges (2 cores, 4 ECUs, 7.5 GB RAM).
2.) This all disregards search performance. As we will also want to run
queries on this data, can we expect running queries on 'old' indexes to
significantly disrupt _bulk performance?
3.) Any tips, or a pointer in the right direction regarding documentation
(the elasticsearch docs are great though!) about this specific area, would
be much appreciated!
Cheers,
Fin