Hi!
The index is very simple: we're essentially using ES as a key-value store for now, with only one field indexed. Each doc is around ~1 KB when indexed (index size divided by #docs).
CPU utilization is 50-70%, IO await is ~5 ms (not too bad), and disk IO is around 100 IOPS... yet the bulk queue starts growing and requests get rejected, so I'm not sure where the bottleneck is.
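To confirm which pool the rejections come from, it may help to watch the per-node thread pool stats while indexing, e.g. (the bulk pool is named `write` on recent ES versions, `bulk` on older ones):

```
GET _cat/thread_pool/bulk,write?v&h=node_name,name,active,queue,queue_size,rejected
```

If `rejected` grows on only one or two nodes, the load may be skewed (hot shards) rather than a cluster-wide capacity problem.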
We never write to the same index we read from. The main effect of indexing is a large increase in 95th-percentile read latency (10 ms to 120 ms). But I suppose that makes sense: reads that have to go to disk are slower because of disk IO contention.
We'll try setting index.merge.scheduler.max_thread_count: 1. We're also working on a proper retry policy in the client, so we can decrease the bulk thread count to limit indexing load without dropping requests.
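For the client-side retry policy, something like capped exponential backoff with jitter on bulk rejections (HTTP 429, `es_rejected_execution_exception`) is a common shape. A minimal sketch — `BulkRejectedError` and `send_bulk` are hypothetical placeholders, not real client APIs:

```python
import random
import time


class BulkRejectedError(Exception):
    """Placeholder for a bulk request rejected by the cluster (HTTP 429)."""


def retry_with_backoff(send_bulk, max_retries=5, base_delay=0.5, max_delay=30.0):
    """Call send_bulk(), retrying rejections with capped exponential backoff.

    send_bulk is a zero-argument callable that performs one bulk request and
    raises BulkRejectedError when the cluster pushes back.
    """
    for attempt in range(max_retries + 1):
        try:
            return send_bulk()
        except BulkRejectedError:
            if attempt == max_retries:
                raise  # give up after max_retries retries
            # Full jitter: sleep a random amount up to the capped exponential delay,
            # so retries from many workers don't arrive in synchronized bursts.
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0.0, delay))
```

The point of the jitter is that when the cluster sheds load, many clients get rejected at once; randomizing the retry delay keeps them from all hammering the queue again at the same instant.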