We're currently seeing indexing performance issues with one of our applications running Elasticsearch 1.7.
We're writing to one index (16 shards, 21 data nodes) with 10 concurrent threads using the Java API. Bulk sizes depend on the amount of incoming data, and since the incoming data varies in size, so do the bulk sizes.
As long as the bulk sizes reach our preferred maximum of 8000 documents, performance is great for all threads. However, as soon as the incoming data load decreases for some threads and their bulk sizes drop towards 1000 documents or fewer, import performance degrades for all threads, even those still sending bulks of 8000 documents.
Are there configuration parameters one should consider for this scenario with varying bulk sizes to maximise performance, or do we need to even out the bulk sizes to achieve a stable indexing rate?
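For context, one way we could even out the bulk sizes is to decouple ingestion from flushing: buffer incoming documents and flush either when a target count is reached or after a maximum linger time, so slow threads still emit reasonably sized bulks. A minimal sketch of that idea, assuming a hypothetical `BulkBuffer` class with the clock injected for clarity (in practice each flushed batch would be submitted as a bulk request via the Java API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical buffer that evens out bulk sizes: flush when the batch
// reaches a target document count, or when the oldest buffered document
// has waited longer than maxLingerMillis.
public class BulkBuffer {
    private final int targetSize;
    private final long maxLingerMillis;
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();
    private long firstAddMillis = -1;

    public BulkBuffer(int targetSize, long maxLingerMillis) {
        this.targetSize = targetSize;
        this.maxLingerMillis = maxLingerMillis;
    }

    // nowMillis is passed in so the flush policy is easy to test.
    public void add(String doc, long nowMillis) {
        if (pending.isEmpty()) {
            firstAddMillis = nowMillis;
        }
        pending.add(doc);
        if (pending.size() >= targetSize
                || nowMillis - firstAddMillis >= maxLingerMillis) {
            flush();
        }
    }

    private void flush() {
        if (!pending.isEmpty()) {
            // In a real importer this is where the batch would be sent
            // as a bulk request; here we just collect it.
            flushed.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    public List<List<String>> flushedBatches() {
        return flushed;
    }
}
```

This is just a sketch of the size-or-time flush policy, not Elasticsearch API code; the Java API's `BulkProcessor` implements the same idea with its bulk-actions and flush-interval settings.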