Slow log took time questions

Hi

I have some questions about the slow log took time

I want to only get the process time in elasticsearch server using esrally to do the benchmark

So I opened the slow log like below

PUT /myIndex/_settings
{
    "index.search.slowlog.threshold.fetch.trace": "0s",
    "index.search.slowlog.threshold.query.trace": "0s",
}

then I can find all the slow logs in the elastic-cloud-logs* index, but if I run a query ,there will be two documents logged , one component is query and another is fetch, so which time should I use as the total processing time in es including both query and fetch phase

query phase log

{
 "slowlog": {
              "cluster": {},
              "elasticsearch.slowlog.search_type": "QUERY_THEN_FETCH",
              "elasticsearch.slowlog.took": "278.3micros",
              "elasticsearch.slowlog.message": "[5c43591a_608d_493f_94aa_5da18abecabc][0]",
              "elasticsearch.slowlog.stats": "[]",
              "elasticsearch.slowlog.source": "{}",
              "elasticsearch.slowlog.id": "null",
              "node": {},
              "elasticsearch.slowlog.total_shards": "1",
              "elasticsearch.slowlog.took_millis": "0",
              "elasticsearch.slowlog.total_hits": "10001+ hits"
            }
}

fetch phase log

{
"slowlog": {
              "cluster": {},
              "elasticsearch.slowlog.search_type": "QUERY_THEN_FETCH",
              "elasticsearch.slowlog.took": "1.4ms",
              "elasticsearch.slowlog.message": "[5c43591a_608d_493f_94aa_5da18abecabc][0]",
              "elasticsearch.slowlog.stats": "[]",
              "elasticsearch.slowlog.source": "{}",
              "elasticsearch.slowlog.id": "null",
              "node": {},
              "elasticsearch.slowlog.total_shards": "1",
              "elasticsearch.slowlog.took_millis": "1",
              "elasticsearch.slowlog.total_hits": "10001+ hits"
            }
}

Does the "elasticsearch.slowlog.took": "278.3micros" includes both the query and fetch phase time or need to add the time in fetch phase ?

Thanks

Hello @tengfei225!

You should use the sum of both times. In that case that would be 1.4ms + 278.3micros, that is 1.4ms + 0.3ms or 1.7ms. I would expect the took time reported by the request result to be 2 as 1.7ms rounds to 2ms.

Many Rally operations return that result already, so if you configure metrics (Metrics - Rally 2.8.0 documentation) you may not have to configure slowlog. Search for took in Track Reference - Rally 2.8.0 documentation to see all operations that store the took time. (In some cases enabling detailed-results is needed though.)

That said, took is a server metric, and in most cases you will care more about the latency as seen by the client. See Frequently Asked Questions (FAQ) - Rally 2.8.0 documentation for more details.

Thanks @Quentin_Pradet , your reply helps a lot .
Is this reasonable if I separate the query and fetch phase, like I only use fetch phase took time to benchmark the _source/doc_value/stored since we have some test cases about which one from _source/doc_value/stored is better to retrieve data ? Will this way be more accurate ?

Well, no, using fetch phase isn't a very accurate measurement. It will exclude various things:

  • Query phase
  • Possibly time spent in queues inside Elasticsearch (I don't know the details enough to tell)
  • Time needed to send the request to Elasticsearch
  • Time needed to serialize the JSON response
  • Time needed to send the response to a client

See Search API | Elasticsearch Guide [8.8] | Elastic for the last three bullets.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.