In my experience, indexing is often quite CPU intensive, and CPU generally saturates before disk I/O does. Exactly how CPU intensive it is depends on the indexing throughput, the size and structure of the documents, and the mappings used. If you need to reduce CPU usage, e.g. to keep query latency within a certain bound, you may need to reduce the indexing throughput or scale out the cluster.
The more CPU ES can use, the faster it is. So using the CPU cycles is perfectly fine, and 85% is nothing! It means the system load is still below 1.0. (Note that during my bulk sessions the system load can rise to 8.0-12.0.)
You can look at your client and decrease the bulk request size and the bulk request frequency; you can even pause between bulk requests. This is very easy to do and will slow down indexing.
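To illustrate the client-side approach, here is a minimal sketch in Python. It only shows the batching and pausing logic; `send_bulk` is a hypothetical callable standing in for whatever your client uses to submit one bulk request, and the names `throttled_bulk`, `batch_size`, and `pause_seconds` are my own, not part of any ES client.

```python
import time
from typing import Callable, Iterable, Iterator, List


def chunked(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` items."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


def throttled_bulk(
    docs: Iterable[dict],
    send_bulk: Callable[[List[dict]], None],  # hypothetical: submits one bulk request
    batch_size: int = 500,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send docs in small batches, pausing between consecutive bulk requests.

    Returns the number of documents handed to `send_bulk`.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    sent = 0
    for i, batch in enumerate(chunked(docs, batch_size)):
        # Pause before every request except the first one.
        if i > 0 and pause_seconds > 0:
            sleep(pause_seconds)
        send_bulk(batch)
        sent += len(batch)
    return sent
```

Lowering `batch_size` and raising `pause_seconds` both reduce the indexing rate the cluster sees; the right values depend on your documents and hardware, so treat the defaults here as placeholders.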
If you want more advanced configuration, in 1.5.2 you can use store throttling to slow down the ES nodes themselves, regardless of how fast the clients send data.
The original motivation for store throttling was to deal with slow disks blocking the CPU on ES servers. On modern hardware this is no longer an issue. Be aware that store throttling has been removed in ES 2.x. It was an advanced setting and must be handled carefully; for example, all nodes should throttle at the same rate.
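As a rough sketch of what this could look like on a 1.x cluster, the snippet below builds a request body for the cluster update settings API using the 1.x store throttling settings `indices.store.throttle.type` and `indices.store.throttle.max_bytes_per_sec`. The helper name `store_throttle_settings` and the example values are my own assumptions; check the reference documentation for your exact version before applying anything.

```python
import json


def store_throttle_settings(
    max_bytes_per_sec: str = "20mb",
    throttle_type: str = "merge",
    persistent: bool = False,
) -> dict:
    """Build a cluster settings body for ES 1.x store throttling.

    Hypothetical helper: it only constructs the body, it does not send it.
    These settings do not exist in ES 2.x and later.
    """
    if throttle_type not in ("none", "merge", "all"):
        raise ValueError("throttle_type must be 'none', 'merge' or 'all'")
    scope = "persistent" if persistent else "transient"
    return {
        scope: {
            "indices.store.throttle.type": throttle_type,
            "indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec,
        }
    }


# Example body: limit merge I/O to an illustrative 5mb per second.
body = json.dumps(store_throttle_settings("5mb"), indent=2)
```

The resulting JSON would be sent with a `PUT` to `/_cluster/settings`. Applying it at the cluster level rather than per node is one way to keep every node throttling at the same rate.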