Elasticsearch bulk size/performance

I am trying to load test my Elasticsearch instance to figure out the optimal bulk size. Below is my setup:

  • 1 Elasticsearch node running the latest version (2.4)
  • 32 GB heap size
  • 1 index, 1 shard, 0 replicas
  • refresh_interval = -1
  • indices.memory.index_buffer_size: 30
  • index.translog.flush_threshold_size: 10000
  • Mapping has ~20 fields, all not_analyzed and stored, with a lowercase mapping (sketched in code below)
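
For reference, here is a minimal sketch of roughly this setup using the official Python client. The index name, mapping type, field name, and the lowercase-keyword analyzer are assumptions (a keyword tokenizer plus lowercase filter is one common way to get "not analyzed, but lowercased" strings on 2.x), so adjust them to match the real mapping:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])  # assumed single-node address

# Note: indices.memory.index_buffer_size is a node-level setting and belongs
# in elasticsearch.yml, not in the index creation request below.
es.indices.create(
    index="loadtest",  # hypothetical index name
    body={
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "refresh_interval": "-1",  # disable refresh during the load test
            "analysis": {
                "analyzer": {
                    # assumed analyzer: emits the whole value as one
                    # lowercased token
                    "lowercase_keyword": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase"],
                    }
                }
            },
        },
        "mappings": {
            "doc": {  # 2.x mapping type; placeholder name
                "properties": {
                    # one of the ~20 string fields; the rest look the same
                    "field1": {
                        "type": "string",
                        "analyzer": "lowercase_keyword",
                        "store": True,
                    }
                }
            }
        },
    },
)
```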

I tested with 30 parallel workers doing bulk indexing with batch sizes of 100, 250, 500, and 1000; each document is roughly 250 bytes. I found that I get the same throughput for all batch sizes; larger batches just take proportionately longer to index. I get around 60k inserts/sec. CPU usage, however, increases from ~30% to ~60% (across all cores).
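
A rough sketch of the test harness, using the Python client's parallel_bulk helper; the document shape, index name, and total count are placeholders:

```python
import time

from elasticsearch import Elasticsearch
from elasticsearch.helpers import parallel_bulk

es = Elasticsearch(["localhost:9200"])

def gen_docs(n):
    # ~250-byte documents, roughly matching the test (placeholder shape)
    for i in range(n):
        yield {"_index": "loadtest", "_type": "doc",
               "_source": {"id": i, "payload": "x" * 200}}

TOTAL = 1000000
start = time.time()
for ok, item in parallel_bulk(es, gen_docs(TOTAL),
                              thread_count=30,  # parallel workers
                              chunk_size=500):  # batch size under test
    if not ok:
        print("failed:", item)
elapsed = time.time() - start
print("%d docs in %.1fs -> %.0f docs/sec" % (TOTAL, elapsed, TOTAL / elapsed))
```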

  1. Is this expected? The documentation suggests starting testing at 5 MB or so, but when I try that, Elasticsearch just takes way too long to respond.
  2. What batch size should I choose in this case? I am guessing one of the lower ones, since the calls return quickly and consume less memory.
  3. Are there any other settings I can tweak to get more performance?
  4. I understand performance varies depending on the setup, but is 60K/sec reasonable for this setup? It more than suffices for our use case, but I am trying to get a good benchmark.

As far as I recall the documentation recommends a maximum bulk size of around 5MB, not to start at that point. A common methodology to determine the optimal bulk size is to start small and increase while throughput keeps improving. In benchmarks I have performed I am usually able to saturate a node with considerably fewer parallel indexing threads, so unless 30 is a requirement, you may want to benchmark with fewer indexing threads as well.
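
A sketch of that methodology in Python, assuming the same placeholder index and documents as the earlier snippets: double the batch size until throughput stops improving (the 5% cut-off is an arbitrary choice).

```python
import time

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["localhost:9200"])

def gen_docs(n):
    # same placeholder ~250-byte documents as the earlier sketch
    for i in range(n):
        yield {"_index": "loadtest", "_type": "doc",
               "_source": {"id": i, "payload": "x" * 200}}

def measure(batch_size, total=100000):
    # index `total` docs in batches of `batch_size`, return docs/sec
    start = time.time()
    bulk(es, gen_docs(total), chunk_size=batch_size)
    return total / (time.time() - start)

best, size = 0.0, 100
while True:
    rate = measure(size)
    print("batch size %5d -> %.0f docs/sec" % (size, rate))
    if rate < best * 1.05:  # less than a 5% gain: stop growing
        break
    best, size = rate, size * 2
```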

Yes, that seems like a good choice.

Elasticsearch 2.x does a lot of optimisations behind the scenes, so there are fewer parameters that need tuning compared to earlier versions. In the benchmarks I did for my talk at Elastic{ON} I tested with a varying number of shards, which can make a difference.

That seems to be a good number, but this always depends a lot on the number of CPUs, disk performance, the type and size of the data, and the mappings used. Sustaining the maximum indexing rate does, however, leave very few resources for querying. I therefore always recommend benchmarking with a combined, realistic indexing and query load to find the practical limit for a cluster.
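
As an illustration, a sketch of such a combined load: a few threads doing bulk indexing alongside a few running a placeholder query (replace the query body with something representative of real traffic).

```python
import threading
import time

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["localhost:9200"])
stop = threading.Event()

def gen_docs(n):
    # same placeholder documents as the earlier sketches
    for i in range(n):
        yield {"_index": "loadtest", "_type": "doc",
               "_source": {"id": i, "payload": "x" * 200}}

def index_load():
    # keep a steady stream of bulk requests going
    while not stop.is_set():
        bulk(es, gen_docs(10000), chunk_size=500)

def query_load():
    # placeholder query; use something representative of real traffic
    while not stop.is_set():
        es.search(index="loadtest",
                  body={"query": {"match": {"payload": "x"}}})

workers = [threading.Thread(target=index_load) for _ in range(4)] + \
          [threading.Thread(target=query_load) for _ in range(4)]
for w in workers:
    w.start()
time.sleep(300)  # run the mixed load for five minutes
stop.set()
for w in workers:
    w.join()
```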
