The best way to improve efficiency and reduce the risk of rejections is often to limit the number of shards you index into. Instead of daily indices with a large number of shards, I would recommend switching to rollover and ILM. That way you can set the number of primary shards to 5 per index (the same as your number of data nodes) and have rollover switch to a new underlying index based on size (a common recommendation is around 50GB per shard) and/or time. In your case that probably means you would generate multiple indices per day, each covering a shorter time period. This should be more efficient.
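To make the sizing concrete, here is a rough sketch in Python that just builds the JSON bodies you would send to Elasticsearch. It assumes 5 primary shards and the ~50GB per shard guideline mentioned above, which gives a total rollover size of about 250GB per index. The policy name `logs-rollover-policy`, the alias `logs-write` and the 1 day `max_age` are placeholders I made up for illustration, so adjust them to your setup.

```python
import json

PRIMARY_SHARDS = 5         # one primary per data node, as suggested above
TARGET_SHARD_SIZE_GB = 50  # rule-of-thumb shard size, not a hard limit
MAX_AGE = "1d"             # assumption: roll over at least once a day


def build_ilm_policy(primary_shards, shard_size_gb, max_age):
    """Build an ILM policy body with a size- and age-based rollover."""
    # max_size applies to the total primary size of the index,
    # so multiply the per-shard target by the number of primaries.
    max_index_size_gb = primary_shards * shard_size_gb
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_size": f"{max_index_size_gb}gb",
                            "max_age": max_age,
                        }
                    }
                }
            }
        }
    }


def build_index_settings(primary_shards, policy_name, rollover_alias):
    """Build the settings fragment that would go into an index template."""
    return {
        "settings": {
            "index.number_of_shards": primary_shards,
            "index.lifecycle.name": policy_name,
            "index.lifecycle.rollover_alias": rollover_alias,
        }
    }


# Hypothetical names, used only for illustration.
policy = build_ilm_policy(PRIMARY_SHARDS, TARGET_SHARD_SIZE_GB, MAX_AGE)
settings = build_index_settings(PRIMARY_SHARDS, "logs-rollover-policy", "logs-write")

print(json.dumps(policy, indent=2))
print(json.dumps(settings, indent=2))
```

The first body would be sent with `PUT _ilm/policy/<policy name>` and the second would be part of the index template that matches your rollover indices. Rollover triggers on whichever condition is met first, so on a busy day the size limit creates several indices and on a quiet day the age limit still rolls over.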
Indexing tends to be disk I/O intensive, so CPU is often not the limiting factor. As you are using i3 instances, I suspect you should be fine though. How many concurrent clients/connections are you using for indexing?