I see a big difference in indexing throughput between small documents and large documents. Is this expected, and if so, why, given the test conditions below? (A sketch of the setup follows the list.)
- Small documents are ~1 KB; large documents are 10 KB to 30 KB
- Observed throughput is ~3 MB/s for small documents vs ~20 MB/s for large ones (I cannot push past ~4,000 documents/sec)
- Bulk size is 300 documents; there is no performance improvement beyond this
- Refresh is disabled (refresh_interval set to -1)
- No fields are analyzed
- Index buffer and translog are sized appropriately
- Disk storage, no replication, 1 shard
- Auto-generated IDs make no significant difference
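
For reference, here is a minimal sketch of the kind of setup described above, using the Python elasticsearch client (8.x-style API). The index name `perf_test`, the `body` field, the localhost URL, and the document counts are made up for illustration, not my actual harness:

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

# 1 shard, no replicas, refresh disabled: same conditions as the list above
es.indices.create(
    index="perf_test",
    settings={
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "refresh_interval": "-1",
    },
    mappings={
        "properties": {
            # keyword fields are indexed without analysis
            "body": {"type": "keyword"},
        }
    },
)

def actions(num_docs, doc_kb):
    """Yield bulk actions; omitting _id lets Elasticsearch auto-generate it."""
    payload = "x" * (doc_kb * 1024)
    for _ in range(num_docs):
        yield {"_index": "perf_test", "body": payload}

# 300-document bulks, as in the test
helpers.bulk(es, actions(100_000, doc_kb=1), chunk_size=300)
```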
BTW, refreshes still happen when the index buffer is half full (the index buffer presumably uses a ping-pong/double-buffering implementation).
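
One can confirm that by polling index stats while the load runs; a sketch, reusing the assumed `perf_test` index from above. If refresh.total keeps climbing even with refresh_interval set to -1, refreshes are being forced by the buffer filling up rather than by the scheduler:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Poll refresh/flush/segment counters during indexing
stats = es.indices.stats(index="perf_test", metric=["refresh", "flush", "segments"])
primaries = stats["indices"]["perf_test"]["primaries"]
print("refreshes:", primaries["refresh"]["total"])
print("flushes:  ", primaries["flush"]["total"])
print("segments: ", primaries["segments"]["count"])
```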
I would like to understand the per-document processing overhead (including per-field overhead) and where the bottlenecks are.
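
To frame the question, here is a hedged sketch of how one might separate fixed per-document cost from per-byte cost, again using the made-up `perf_test` index and `body` field from the first sketch:

```python
import time
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

def measure(doc_kb, num_docs=30_000, chunk=300):
    """Index num_docs documents of doc_kb KB each and report throughput."""
    payload = "x" * (doc_kb * 1024)
    docs = ({"_index": "perf_test", "body": payload} for _ in range(num_docs))
    start = time.perf_counter()
    helpers.bulk(es, docs, chunk_size=chunk)
    elapsed = time.perf_counter() - start
    print(f"{doc_kb:>3} KB: {num_docs / elapsed:8.0f} docs/s, "
          f"{num_docs * doc_kb / 1024 / elapsed:6.1f} MB/s")

for kb in (1, 10, 30):
    measure(kb)
```

If docs/sec stays roughly flat across sizes while MB/s scales with document size, the cost is dominated by fixed per-document (and per-field) work rather than by raw bytes, which matches what I am seeing.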