Esrally benchmark run with(/out) refresh_interval param doesn't show any difference in runtime

I have done two benchmark runs using esrally, one with the default refresh interval and a second with "refresh_interval" set to "-1" (disabled). Both runs took the same amount of time (382 seconds).

I expected the second benchmark run to complete faster, since no periodic refresh was involved.

Here are the shard stats with the default refresh interval (5s):

"refresh" : {
  "total" : 237,
  "total_time_in_millis" : 75997,
  "external_total" : 213,
  "external_total_time_in_millis" : 73532,
  "listeners" : 0
},

Here are the shard stats with the refresh interval disabled:

"refresh" : {
  "total" : 135,
  "total_time_in_millis" : 59627,
  "external_total" : 8,
  "external_total_time_in_millis" : 0,
  "listeners" : 0
},

With the refresh interval disabled, I am not sure why "total_time_in_millis" is still recorded as 59627.

Thanks.

Hi @Mahesh3, can you share your whole invocation? We may be able to give you some ideas then.

Hi Mahesh, welcome to the Elastic community!

Disabling refresh allows Elasticsearch to fill its entire indexing buffer during indexing. Refreshes are triggered when the buffer is full and it is time to write segments to disk.

Search requests will also trigger a refresh.

Thank you,
Jason
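
The two refresh blocks posted above can be split into external and internal refreshes to see where the time went in each run. This is only a minimal Python sketch: it assumes that "total" and "total_time_in_millis" include the external figures, so the difference is the internal share, and the helper name internal_refresh is made up for illustration.

```python
# Refresh stats copied from the two runs posted in this thread.
default_5s = {
    "total": 237,
    "total_time_in_millis": 75997,
    "external_total": 213,
    "external_total_time_in_millis": 73532,
}
disabled = {
    "total": 135,
    "total_time_in_millis": 59627,
    "external_total": 8,
    "external_total_time_in_millis": 0,
}


def internal_refresh(stats):
    """Return (count, millis) of refreshes that are not counted as external.

    Assumption: the total fields include the external ones.
    """
    count = stats["total"] - stats["external_total"]
    millis = stats["total_time_in_millis"] - stats["external_total_time_in_millis"]
    return count, millis


for name, stats in (("default 5s", default_5s), ("disabled", disabled)):
    count, millis = internal_refresh(stats)
    print(f"{name}: {count} internal refreshes, {millis} ms")
```

Under that assumption, the run with the interval disabled still performed 127 internal refreshes accounting for all 59627 ms, compared with 24 internal refreshes (2465 ms) in the default run. That would be consistent with refreshes being triggered by a full indexing buffer rather than by the interval.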

Hi Jason,

There were no search requests, only bulk indexing.

I have added "indices.memory.index_buffer_size" (=50%, and also tried 100%) to the Elasticsearch config file, set the heap size to 8GB (32GB RAM machine), and repeated the esrally run with the refresh interval disabled. I didn't see any improvement in run times.

I am not sure what else should be done to speed up indexing.

Here is the esrally command for my test:

esrally race --preserve-install --distribution-version=7.16.0 --track=eventdata --track-repository=eventdata --challenge=index-logs-fixed-daily-volume --track-params="index_prefix:'test12',bulk_indexing_clients:12,number_of_shards:4,daily_logging_volume:'3GB',number_of_days:1,refresh_interval:'-1'"

Thanks,
Mahesh
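
For scale, the "indices.memory.index_buffer_size" percentages mentioned above translate into the following buffer sizes on an 8GB heap. This is just a small arithmetic sketch in Python: the 10% row is the documented default for that setting, and the helper name index_buffer_gb is hypothetical.

```python
HEAP_GB = 8  # heap size stated in the post


def index_buffer_gb(heap_gb, percent):
    """Size in GB of the indexing buffer for a percentage-of-heap setting."""
    return heap_gb * percent / 100


# 10% is the default; 50% and 100% are the values tried in the post.
for percent in (10, 50, 100):
    size = index_buffer_gb(HEAP_GB, percent)
    print(f"{percent}% of {HEAP_GB} GB heap = {size:.1f} GB")
```

The buffer is shared by all shards on the node, so with 4 shards each one gets only part of it.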

Hi Rick.

I am following a three-step process to run the test:

  1. esrally install - to install Elasticsearch
  2. esrally start - to start Elasticsearch (before starting, edit the Elasticsearch and JVM configuration to add the respective parameters)
  3. esrally race - run the command below with its params.

esrally race --preserve-install --distribution-version=7.16.0 --track=eventdata --track-repository=eventdata --challenge=index-logs-fixed-daily-volume --track-params="index_prefix:'test12',bulk_indexing_clients:12,number_of_shards:4,daily_logging_volume:'3GB',number_of_days:1,refresh_interval:'-1'"

Thanks,
Mahesh

If there are no search requests issued, using default settings, Elasticsearch will not periodically refresh, see the docs.

Therefore there shouldn't be any performance difference between indexing with an explicit setting of index.refresh_interval to -1 and using the defaults.

EDIT
That said, if we are talking specifically about the eventdata track, the default setting for refresh_interval is 5s: GitHub - elastic/rally-eventdata-track: Rally track for simulating event-based data use-cases. If you aren't seeing a difference in indexing throughput, I'd suggest you take a look at your methodology, especially whether there are other bottlenecks (e.g. is the load driver perhaps saturated?).

The command you provided earlier indicates that the load driver is running on the same machine as Elasticsearch, which is an anti-pattern for benchmarking. I recommend watching the 7 deadly sins of benchmarking video to avoid some common pitfalls.


Yes, I am aware that Elasticsearch and the benchmark client shouldn't be on the same instance. This is just a basic test to evaluate performance on a very small dataset. In the actual benchmark test, the Elasticsearch and esrally client nodes are on two different instances.

My understanding is that when refresh is disabled, there should be at least a 1% performance improvement. But that's not happening.

The refresh interval used to have a significant impact in older versions of Elasticsearch, but due to improvements in newer versions this is no longer necessarily the case.
