The primary dev managing our ES cluster has stated that single-document
writes to ES will only give us roughly 30 to 40 writes a second, whereas
bulk operations will give us something in the range of 1,000+. I realize
that bulk is generally faster and that any process is subject to hardware
and environment constraints. However, with other technologies you do not
pay such a heavy price for single insertions. I am obviously ignorant when
it comes to ES, but why do you pay such a heavy price for single-document
writes in ES? Or are we just not properly informed?

Environment:
Apache Storm writes to our ES cluster.
Currently all of the writes are processed in bulk operations.
From what I understand (and I may not be 100% right), most of the overhead
comes from building and handling the HTTP request, which is a relatively
heavy operation. A bulk request pays that cost once per batch instead of
once per document.
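
To make the per-request cost concrete, here is a rough sketch in Python of how documents are packed into a single body for the `_bulk` endpoint, and how many HTTP requests each approach needs. The index name `events`, the document fields, and the batch size of 500 are made up for illustration, and nothing is actually sent to a cluster. Older ES versions also expect a `_type` in the action line, which is left out here.

```python
import json
from math import ceil


def build_bulk_body(index_name, docs):
    """Serialize docs as newline-delimited JSON: one action line, then one source line, per doc."""
    lines = []
    for doc in docs:
        # Action line telling ES to index the document that follows.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


def request_count(num_docs, batch_size):
    """Number of HTTP requests needed to send num_docs in batches of batch_size."""
    return ceil(num_docs / batch_size)


# Hypothetical documents, standing in for tuples emitted by a Storm bolt.
docs = [{"tuple_id": i, "msg": "event"} for i in range(1000)]

# One bulk body carrying the first 500 documents in a single request.
body = build_bulk_body("events", docs[:500])

single_requests = request_count(len(docs), 1)    # one request per document
bulk_requests = request_count(len(docs), 500)    # one request per batch of 500

print(single_requests, bulk_requests)
```

If the fixed cost per request dominates, as suggested above, cutting the number of requests this way would account for most of the difference. The actual numbers depend on the cluster and its settings.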