What is request error rate?

What is request error rate? How do we find what caused these errors?

Hi @stirumale,

Rally captures metrics for each request it issues (e.g. when the request started and how long it took to get a response). Whenever a request fails, for example because the connection breaks when the server is down or because the server returns an HTTP error (e.g. HTTP 401, 404, 500), Rally treats the request as erroneous and records this in the "meta" block of the respective request metrics record ("success" is set to false and the error details are stored next to it).

Here is an example of a metric record for a successful request:

        {
          "operation-type": "Search",
          "lap": 1,
          "environment": "nightly",
          "relative-time": 2099063018,
          "car": "defaults",
          "sample-type": "normal",
          "challenge": "append-no-conflicts",
          "value": 299.2303790524602,
          "trial-timestamp": "20170425T000047Z",
          "track": "geonames",
          "unit": "ms",
          "name": "service_time",
          "meta": {
            "distribution_version": "6.0.0-alpha1",
            "success": true,
            "source_revision": "6ebf087"
          },
          "@timestamp": 1493082110434,
          "operation": "expression"
        }

and here is an example of a metric record for an erroneous request:

        {
          "lap": 1,
          "trial-timestamp": "20170421T101914Z",
          "meta": {
            "success": false,
            "error-description": """{"_scroll_id":"DnF1ZXJ5VGhlbkZldGNoBwAAAAAAAHEvFnZfZ202ODFTVEtTYVplXy1ocTljc3cAAAAAAABxMBZ2X2dtNjgxU1RLU2FaZV8taHE5Y3N3AAAAAAAAW_MWTF9IX1BTWGtScU83aWFKT01NQVBaQQAAAAAAAFv0FkxfSF9QU1hrUnFPN2lhSk9NTUFQWkEAAAAAAABb9RZMX0hfUFNYa1JxTzdpYUpPTU1BUFpBAAAAAAAAW_IWTF9IX1BTWGtScU83aWFKT01NQVBaQQAAAAAAAFv2FkxfSF9QU1hrUnFPN2lhSk9NTUFQWkE=","took":25086,"timed_out":false,"terminated_early":true,"_shards":{"total":7,"successful":0,"failed":2,"failures":[{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [28975]"}},{"shard":-1,"index":null,"reason":{"type":"search_context_missing_exception","reason":"No search context found for id [28976]"}}]},"hits":{"total":6409757,"max_score":null,"hits":[]}}""",
            "attribute_max_running_jobs": "10",
            "distribution_version": "6.0.0-alpha1",
            "http-status": 404,
            "source_revision": "47160ba"
          },
          "environment": "marathon",
          "@timestamp": 1492818845806,
          "car": "external",
          "challenge": "index-and-search",
          "operation-type": "Search",
          "sample-type": "normal",
          "operation": "search-recent-exceptions",
          "unit": "ms",
          "name": "service_time",
          "track": "logs",
          "value": 33239.725890991394,
          "relative-time": 48884448282
        }
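If you want to pull the cause out of such a record programmatically, here is a minimal Python sketch. It assumes the record has already been loaded as a dict shaped like the examples above, and that "error-description" holds an Elasticsearch response body as a JSON string (as in this example; it may also be plain text, in which case only the HTTP status is reported). The function name `describe_failure` is hypothetical and not part of Rally.

```python
import json


def describe_failure(record):
    """Summarize a failed request record; return None if the request succeeded.

    Hypothetical helper, not part of Rally. Assumes `record` is a dict shaped
    like the metrics records shown above.
    """
    meta = record.get("meta", {})
    # Assumption: a record without a "success" flag counts as successful.
    if meta.get("success", True):
        return None

    summary = {
        "operation": record.get("operation"),
        "http_status": meta.get("http-status"),
        "reasons": [],
    }

    description = meta.get("error-description")
    try:
        body = json.loads(description) if description else {}
    except (TypeError, ValueError):
        # The description is not JSON (e.g. a plain error message).
        body = {}

    # Collect the shard failure types if the response body contains them.
    if isinstance(body, dict):
        shards = body.get("_shards")
        if isinstance(shards, dict):
            for failure in shards.get("failures", []):
                reason = failure.get("reason", {})
                summary["reasons"].append(reason.get("type"))

    return summary
```

For the erroneous record above this would report the operation "search-recent-exceptions", HTTP status 404 and two "search_context_missing_exception" shard failures.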

In the summary report we only show the request error rate, i.e. the ratio of erroneous requests to the total number of recorded requests. As you've seen, you can get the full error details if you set up a dedicated Elasticsearch metrics store for Rally.
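To make the ratio concrete, here is a rough Python sketch of the calculation over a list of metrics records, together with a query body you could send to the metrics store to list the failed requests. This is an illustration under assumptions, not Rally's actual implementation: it counts only "service_time" samples so that each request is counted once, treats records without a "success" flag as successful, and assumes the "operation" field is indexed as an exact-match (keyword) field. Both function names are hypothetical.

```python
def error_rate(records, name="service_time"):
    """Return failed / total for the samples with the given metric name.

    Hypothetical helper. Assumption: one sample with this name is recorded
    per request, so counting samples is the same as counting requests.
    """
    samples = [r for r in records if r.get("name") == name]
    if not samples:
        return 0.0
    failed = sum(
        1 for r in samples
        # Assumption: a missing "success" flag means the request succeeded.
        if not r.get("meta", {}).get("success", True)
    )
    return failed / len(samples)


def failed_requests_query(operation=None):
    """Build an Elasticsearch query body matching failed request records.

    Hypothetical helper. It only builds the body; which index to search
    depends on how your metrics store is set up.
    """
    filters = [{"term": {"meta.success": False}}]
    if operation is not None:
        # Assumption: "operation" is mapped as a keyword field.
        filters.append({"term": {"operation": operation}})
    return {"query": {"bool": {"filter": filters}}}
```

With one failed and three successful "service_time" samples, `error_rate` returns 0.25, which the summary report would show as an error rate of 25%.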

I've also opened https://github.com/elastic/rally/issues/274 to explain the summary report (and specifically request error rate) in more detail in the official docs.

Daniel
