I am trying to load-test my Elasticsearch instance to figure out the optimal bulk size. Below is my setup:
- 1 Elasticsearch node running the latest release (2.4)
- 32 GB heap size
- 1 index, 1 shard, 0 replicas
- index.refresh_interval: -1
- indices.memory.index_buffer_size: 30
- index.translog.flush_threshold_size: 10000
- The mapping has ~20 fields, all not_analyzed and stored, with lowercasing applied in the mapping (a sketch of the index creation follows this list).
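For reference, here is roughly how I create the index (a minimal sketch using the Python elasticsearch client; the index name `loadtest` and the field names are placeholders, not my real ones; `indices.memory.index_buffer_size` is a node-level setting, so it lives in elasticsearch.yml rather than here):

```python
# Rough sketch of the index creation (elasticsearch-py 2.x client).
# Index and field names are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

es.indices.create(
    index="loadtest",
    body={
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "refresh_interval": "-1",
            "translog.flush_threshold_size": "10000",
        },
        "mappings": {
            "doc": {
                "properties": {
                    # one of ~20 similar fields: not_analyzed and stored;
                    # lowercasing is handled in the real mapping
                    "field_1": {
                        "type": "string",
                        "index": "not_analyzed",
                        "store": True,
                    },
                },
            },
        },
    },
)
```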
I tested with 30 parallel workers doing bulk indexing with batch sizes of 100, 250, 500, and 1000 documents; each document is roughly 250 bytes. I get the same throughput for all batch sizes, around 60k inserts/sec; larger batches just take proportionately longer per request. CPU usage, however, increases from ~30% to ~60% (across all cores).
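This is approximately what the test harness does (a sketch built on elasticsearch-py's `helpers.bulk`; the document volume, index name, and field contents are placeholders standing in for my real data):

```python
# Sketch of the load test: 30 worker threads, each sending bulk requests
# of a fixed batch size against a single-node cluster.
import threading
import time

from elasticsearch import Elasticsearch, helpers

NUM_WORKERS = 30
DOCS_PER_WORKER = 100000   # placeholder volume
BATCH_SIZE = 500           # varied across runs: 100, 250, 500, 1000

def make_doc(i):
    # ~250-byte documents with ~20 small fields, mirroring the real payload
    return {
        "_index": "loadtest",
        "_type": "doc",
        "_source": {"field_%d" % n: "value_%d" % i for n in range(20)},
    }

def worker():
    es = Elasticsearch(["localhost:9200"])
    actions = (make_doc(i) for i in range(DOCS_PER_WORKER))
    # helpers.bulk slices the generator into bulk requests of BATCH_SIZE docs
    helpers.bulk(es, actions, chunk_size=BATCH_SIZE)

start = time.time()
threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.time() - start
total = NUM_WORKERS * DOCS_PER_WORKER
print("%d docs in %.1fs -> %.0f inserts/sec" % (total, elapsed, total / elapsed))
```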
- Is this expected? The documentation suggests starting to test at around 5 MB per bulk request, but when I try that, Elasticsearch takes far too long to respond. (At ~250 bytes per document, 5 MB works out to roughly 20,000 documents per batch.)
- What batch size should I choose in this case? I am guessing one of the lower ones, since the call returns quickly and consumes less memory.
- Are there any other settings that I can tweak to get more performance?
- I understand performance varies depending on the setup, but is 60K inserts/sec reasonable for this setup? It more than suffices for our use case, but I am trying to establish a good benchmark.