I'm using Elasticsearch 1.7.5. My use case is a single, well-provisioned server running two ES instances. I am indexing events for near-real-time search and aggregation; there are no updates or deletes. Events are ~3K records, and I've observed peaks of up to 100K events per second.
I am unable to index at that rate due to various bulk-indexing issues, one of which is index throttling (described below). The server is limited to a RAID6 array of legacy magnetic disks, and it indexes these events continuously through a transport client using the Bulk API. Unfortunately, I am limited to this one server for now, and upgrading to ES 2.0 is not an option I can currently consider. The indices are daily, with 8 shards each.
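For context, here is my back-of-envelope arithmetic on the raw ingest volume. This assumes "3K" means roughly 3 KB per event, and it ignores replica and merge write amplification, so the real disk load is higher:

```python
# Rough peak ingest volume, assuming ~3 KB per event and the
# observed peak of 100K events per second (both from my numbers above).
event_size_bytes = 3 * 1024   # assumption: "3K record" ~= 3 KB
peak_eps = 100_000            # observed peak events/second

peak_mb_per_s = event_size_bytes * peak_eps / (1024 ** 2)
print(f"peak raw ingest: {peak_mb_per_s:.0f} MB/s")  # -> peak raw ingest: 293 MB/s
```

So even before merges rewrite the same bytes several times, the peak is on the order of 300 MB/s into a magnetic RAID6, which is part of why I'm scrutinizing the merge settings.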
One of the issues I am trying to eliminate is that Elasticsearch is itself reducing its net indexing rate by throttling. Regardless of the max_merge_count I select, I encounter throttling: as the log below shows, even with the merge count bumped way up to 20, I still hit this limit almost constantly.
Disk I/O utilization reported by sar/iostat rarely rises above 5%.
2016-03-14 11:41:18,433 [INFO ][index.engine ] [clusterhost-1] [events-2016.03.14] now throttling indexing: numMergesInFlight=21, maxNumMerges=20
2016-03-14 11:41:18,583 [INFO ][index.engine ] [clusterhost-2] [events-2016.03.14] now throttling indexing: numMergesInFlight=21, maxNumMerges=20
2016-03-14 11:41:18,588 [INFO ][index.engine ] [clusterhost-1] [events-2016.03.14] stop throttling indexing: numMergesInFlight=19, maxNumMerges=20
2016-03-14 11:41:18,673 [INFO ][index.engine ] [clusterhost-2] [events-2016.03.14] stop throttling indexing: numMergesInFlight=19, maxNumMerges=20
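As I read the log, the trigger is mechanical: the merge scheduler stalls the indexing thread whenever the number of in-flight merges exceeds the max_merge_count cap, and resumes once the backlog drains back below it. A toy model of that condition (just my reading of the log, not ES code):

```python
def throttle_state(num_merges_in_flight: int, max_num_merges: int) -> str:
    """Mirror the log messages: indexing is throttled while the
    in-flight merge count exceeds the configured cap."""
    return "throttling" if num_merges_in_flight > max_num_merges else "indexing"

# The two transitions from the log above:
print(throttle_state(21, 20))  # -> throttling  ("now throttling indexing")
print(throttle_state(19, 20))  # -> indexing    ("stop throttling indexing")
```

If that reading is right, raising max_merge_count only moves the threshold; it doesn't stop merges from piling up faster than the disks can retire them.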
If I am continuously receiving data, the last thing I want is for ES to stop indexing. What's at work here? How can I sustain continuous indexing without hitting throttling? Does the standard merge/segment guidance change when running on RAID?
# magnetic disk
index.merge.scheduler.max_thread_count: 1
# increased from 8, to 16, to 20
index.merge.scheduler.max_merge_count: 20
# magnetic disk
index.store.throttle.type: none
# index settings
"index.merge.policy.max_merged_segment": "8gb",
"index.merge.policy.segments_per_tier": 20,
index.refresh_interval: 30s
index.merge.policy.max_merge_at_once: 20
Your help and expertise are appreciated.