Error while running esrally to benchmark Elasticsearch

Hi All,

I am getting the error below while running esrally against a single-node Elasticsearch cluster:

[WARNING] Error rate is 100.0 for operation 'refresh-after-index'. Please check the logs.
[WARNING] No throughput metrics available for [refresh-after-index]. Likely cause: Error rate is 100.0%. Please check the logs.

The Rally logs show the following error:

INFO iteration-count-based schedule will determine when the schedule for [refresh-after-index] terminates.
2022-02-22 07:02:27,624 -not-actor-/PID:79 elasticsearch WARNING POST http://es:9200/_all/_refresh [status:N/A request:60.552s]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/elasticsearch/_async/http_aiohttp.py", line 291, in perform_request
    async with self.session.request(
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client.py", line 1138, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client.py", line 559, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 913, in start
    self._continue = None
  File "/usr/local/lib/python3.8/site-packages/aiohttp/helpers.py", line 721, in __exit__
    raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError

Hi,

Thank you for your interest in Rally!

The key here is that you're benchmarking a single-node cluster. When doing that, you need to disable replicas, otherwise your cluster health will be yellow, which will result in failures. You can read more about it in the Elasticsearch documentation: "Resilience in small clusters" (Elasticsearch Guide 8.0).

(This was wrong, please see my next answer.)

Note that benchmarking single-node clusters should be avoided as it is not a realistic scenario (it's fine for testing purposes though).

Sorry, my previous answer was wrong. I realized it because I just saw the same timeout in the Rally logs for the same refresh-after-index operation in a single-node cluster, and my replica count was correct.

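The thing is that refresh can be an expensive operation, and its cost grows with the amount of indexed data that has not been refreshed yet. Additionally, starting with Elasticsearch 7.0.0, shards that have not received a search request for a while (30 seconds by default, controlled by index.search.idle.after) are no longer refreshed in the background unless index.refresh_interval is set explicitly. Rally benchmarks that only index usually issue no searches, so the explicit refresh-after-index call may have to process everything at once.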
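To make the search-idle behaviour a bit more concrete, here is a toy model in Python. It is not Elasticsearch code, only a sketch of the rule as I understand it from the documentation. The function name and parameters are made up for illustration, and the 30-second default for index.search.idle.after is an assumption you should check against your version.

```python
from typing import Optional

# Assumed default of index.search.idle.after, in seconds.
SEARCH_IDLE_AFTER_S = 30.0


def background_refresh_runs(
    seconds_since_last_search: float,
    explicit_refresh_interval: Optional[str] = None,
    idle_after_s: float = SEARCH_IDLE_AFTER_S,
) -> bool:
    """Toy model: would a shard still get periodic background refreshes?

    Hypothetical helper for illustration only, not an Elasticsearch API.
    """
    if explicit_refresh_interval is not None:
        # An explicitly configured interval is honoured; "-1" disables refresh.
        return explicit_refresh_interval != "-1"
    # With the default (unset) interval, a search-idle shard skips refreshes.
    return seconds_since_last_search < idle_after_s


# Indexing-only benchmark: no search for ten minutes, interval left unset.
print(background_refresh_runs(600.0))
# Same situation, but with index.refresh_interval explicitly set to "1s".
print(background_refresh_runs(600.0, explicit_refresh_interval="1s"))
```

In the first case nothing is refreshed in the background, so the final refresh-after-index request has to do all of the work, which is one way it can run past a 60-second client timeout.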

So maybe your timeout of 60 seconds is just too low. Can you share more about how you invoke Rally? I'm interested to see whether this is a default track, what the amount of indexed data is, and how you're configuring the timeout.
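In case it helps, Rally's client timeout can be raised with the --client-options flag (the default is timeout:60, which matches the 60-second request in your log). Below is a small Python sketch that assembles such a command line. The track name geonames and the value 240 are only placeholders since I don't know your setup, and rally_command is a hypothetical helper, not part of Rally.

```python
import shlex
from typing import List


def rally_command(track: str, target_hosts: str, timeout_s: int) -> List[str]:
    """Build an esrally invocation with a custom client timeout.

    Hypothetical helper for illustration; it only assembles arguments.
    """
    if timeout_s <= 0:
        raise ValueError("timeout_s must be a positive number of seconds")
    return [
        "esrally",
        "race",
        f"--track={track}",                # placeholder track name
        f"--target-hosts={target_hosts}",  # host taken from the log above
        "--pipeline=benchmark-only",       # benchmark an already running cluster
        f"--client-options=timeout:{timeout_s}",
    ]


cmd = rally_command("geonames", "es:9200", 240)
# Print a shell-ready command line that can be copied into a terminal.
print(shlex.join(cmd))
```

If the refresh still times out with a much larger value, that would point at the cluster itself (disk, heap, merges) rather than at the client setting.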
