Debugging scroll benchmark

Hi,

I'm running a scroll query benchmark that fetches 200k rows from ES, but the throughput I get (80 ops/s on 4 cores) tells me something is wrong. I also added another test with 500k rows and it landed on the same ops/s. I was expecting less than 1 ops/s, with performance declining as the number of pages grows. So how do I debug this? Can I run Rally in a debug mode that outputs the raw response from ONE query in ONE named operation?

My operation is defined as:

{
  "name": "batch-monthrange-country-200k",
  "operation-type": "search",
  "pages": 200,
  "results-per-page": 1000,
  "param-source": "daterange-query-source"
}

The parameter source is:

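import random


# QueryParamSource is assumed to be defined or imported elsewhere in this track.py.
class DateRangeMonthBatchQueryParamSource(QueryParamSource):
    def params(self):
        # Pick a random one-month window between 12 and 480 months back.
        month = random.randint(12, 480)
        result = {
            "body": {
                "query": {
                    "bool": {
                        "filter": [
                            {
                                "range": {
                                    "transdate": {
                                        # Both bounds are rounded to the start of a month.
                                        "gte": "now-%sM/M" % month,
                                        "lt": "now-%sM/M" % (month - 1)
                                    }
                                }
                            }
                        ]
                    }
                }
            },
            "index": None,
            "type": None,
            "use_request_cache": self._params.get("use_request_cache", False)
        }
        return result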

I.e. we randomize which month range and country to filter on. My suspicion is that the queries return zero or only a few hits instead of hundreds of thousands. So I changed it to a 12-month range to make sure we match enough docs, and that brought it down to 24 ops/s. But I still see no difference between asking for 50, 200 or 500 pages.

So my question is: how can I debug what is actually happening, see which queries are run, and perhaps get some stats on how many results the queries returned on average?

Is the ops/s for a scroll query meant to be the number of complete searches including all pages, or do you count one op per page?

Hi @janhoy,

"ops/s" for scroll queries reports the number of pages retrieved per second, not the number of complete searches. Unfortunately, this is undocumented at the moment but I've opened #287 to fix this.
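
To relate the reported number to complete searches, divide it by the number of pages per search. A minimal sketch of that arithmetic, using the figures from the question and assuming every search really retrieves all configured pages (the function name is only illustrative, it is not part of Rally):

```python
def complete_searches_per_second(pages_per_second, pages_per_search):
    """Convert a per-page throughput into complete scroll searches per second."""
    return pages_per_second / pages_per_search


# 80 pages/s with 200 pages per search
print(complete_searches_per_second(80, 200))  # 0.4
# 24 pages/s (the 12-month range run) with 200 pages per search
print(complete_searches_per_second(24, 200))  # 0.12
```

Under that assumption, 80 ops/s corresponds to fewer than one complete 200-page search per second.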

I've recently added more meta-data for scroll queries, which will be included in the upcoming release 0.5.4. The request meta-data will then also include a hits property that shows the total number of hits for the scroll query. You cannot directly "debug" this, but you can configure a dedicated Elasticsearch metrics store and then analyze the raw requests (the request meta-data are available for all service_time and latency samples).
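
Once the hits property is available, samples exported from the metrics store could be summarized roughly like this. This is only a sketch: the document layout (a `meta` object carrying `hits`, next to `name` and `operation` fields) is an assumption rather than a documented schema, and the sample values below are invented for illustration.

```python
from statistics import mean


def summarize_hits(samples, operation):
    """Summarize the total-hits values of service_time samples for one operation.

    Assumes each sample is a dict shaped like
    {"name": ..., "operation": ..., "meta": {"hits": ...}} (hypothetical layout).
    """
    hits = [
        s["meta"]["hits"]
        for s in samples
        if s.get("name") == "service_time"
        and s.get("operation") == operation
        and "hits" in s.get("meta", {})
    ]
    if not hits:
        return None
    return {"samples": len(hits), "min": min(hits), "max": max(hits), "mean": mean(hits)}


# Invented sample documents, only to show the expected shape.
samples = [
    {"name": "service_time", "operation": "batch-monthrange-country-200k", "meta": {"hits": 0}},
    {"name": "service_time", "operation": "batch-monthrange-country-200k", "meta": {"hits": 10}},
    {"name": "service_time", "operation": "batch-monthrange-country-200k", "meta": {"hits": 20}},
    {"name": "latency", "operation": "batch-monthrange-country-200k", "meta": {"hits": 999}},
]
print(summarize_hits(samples, "batch-monthrange-country-200k"))
```

A mean far below pages × results-per-page would suggest that most queries match too few documents to fill the requested pages.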

Daniel

Thanks for the answer. Then it makes perfect sense that ops/s stays the same for different numbers of pages.
