Search error rate 100

The search could be erroring out for many reasons; the cluster may not be healthy, for example. The easiest way to unearth the error is to add --on-error=abort to your esrally race invocation, e.g.:

esrally race --track=[...] --on-error=abort

For example, given this operation:

    {
      "name": "test",
      "operation": {
        "operation-type": "search",
        "body": {
          "query": {
            "this_isnt_a_valid_query": {}
          }
        }
      }
    },

Running without --on-error=abort gives me logs that look like this:

2023-07-26 00:35:40,41 ActorAddr-(T|:51842)/PID:876 esrally.driver.driver INFO Worker[0] executing tasks: ['test']
2023-07-26 00:35:40,69 ActorAddr-(T|:51842)/PID:876 esrally.driver.driver INFO Worker[0] finished executing tasks ['test'] in 0.026799 seconds
2023-07-26 00:35:45,35 ActorAddr-(T|:51806)/PID:845 esrally.driver.driver INFO All workers completed their tasks until join point [1/1].

and a summary that looks like:

|                                   Total Ingest Pipeline failed |        |   0         |        |
|                                       100th percentile latency |   test |  25.4275    |     ms |
|                                  100th percentile service time |   test |  25.4275    |     ms |
|                                                     error rate |   test | 100         |      % |

[WARNING] Error rate is 100.0 for operation 'test'. Please check the logs.
[WARNING] No throughput metrics available for [test]. Likely cause: Error rate is 100.0%. Please check the logs.

--------------------------------
[INFO] SUCCESS (took 22 seconds)
--------------------------------

If I add --on-error=abort, I get a summary like:

Running test                                                                   [  0% done]
[ERROR] Cannot race. Error in load generator [0]
        Cannot run task [test]: Request returned an error. Error type: api, Description: <_io.BytesIO object at 0x1075d4270> ({"error":{"root_cause":[{"type":"parsing_exception","reason":"unknown query [this_isnt_a_valid_query]","line":1,"col":37}],"type":"parsing_exception","reason":"unknown query [this_isnt_a_valid_query]","line":1,"col":37,"caused_by":{"type":"named_object_not_found_exception","reason":"[1:37] unknown field [this_isnt_a_valid_query]"}},"status":400})

Getting further help:
*********************
* Check the log files in /home/b-deam/.rally/logs for errors.
* Read the documentation at https://esrally.readthedocs.io/en/latest/.
* Ask a question on the forum at https://discuss.elastic.co/tags/c/elastic-stack/elasticsearch/rally.
* Raise an issue at https://github.com/elastic/rally/issues and include the log files in /home/b-deam/.rally/logs.

If you don't want to cancel benchmark execution on errors, but still want to work out where your error is coming from, then you'll want to adjust the logging threshold so that the search request's response is captured in the logs.

You'll find the logger configuration in ~/.rally/logging.json, but the logger you'll want to modify depends on which version of Rally you're using.

For versions < 2.7.1 you'll want to change the elasticsearch logger from WARNING to something like INFO or even DEBUG:

    "elasticsearch": {
      "handlers": [
        "rally_log_handler"
      ],
      "level": "WARNING",
      "propagate": false
    },
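The only change needed is the level value; the rest of the entry stays as-is. An adjusted entry would look something like this (INFO shown here, but DEBUG works the same way):

    "elasticsearch": {
      "handlers": [
        "rally_log_handler"
      ],
      "level": "INFO",
      "propagate": false
    },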

For versions > 2.7.1 you'll want to adjust the elastic_transport logger:

    "elastic_transport": {
      "handlers": [
        "rally_log_handler"
      ],
      "level": "WARNING",
      "propagate": false
    }
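Again, only the level value needs to change, e.g.:

    "elastic_transport": {
      "handlers": [
        "rally_log_handler"
      ],
      "level": "INFO",
      "propagate": false
    }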

For example, adjusting the elastic_transport logger from WARNING to INFO at least gives me this in the logs:

2023-07-26 00:41:11,246 ActorAddr-(T|:51998)/PID:1948 esrally.driver.driver INFO Worker[0] executing tasks: ['test']
2023-07-26 00:41:11,276 -not-actor-/PID:1948 elastic_transport.transport INFO GET https://localhost:9200/logs-221998/_search [status:400 duration:0.028s]
2023-07-26 00:41:11,276 ActorAddr-(T|:51998)/PID:1948 esrally.driver.driver INFO Worker[0] finished executing tasks ['test'] in 0.029163 seconds