How does the kibana "search profiler" "Query Profile" timing work?

I use version 7.13.1

The query I profile:

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "logstash"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "search_tags": "kibana"
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "content.eng": "dead letter queue"
          }
        }
      ]
    }
  },
  "aggs": {
    "author": {
      "terms": {
        "field": "authors.last_name"
      }
    }
  }
}
```

[Image: Search Profiler "Query Profile" screenshot showing the timing breakdown]

How can the "Cumulative time" be 119.227 ms when the individual timings below it do not add up to that total?

@alisongoryachev can we please get some help here?

Thanks,
Bhavya

Hi Stefan,

After having tried to answer your question during the training delivery, I did some more research and also revisited our documentation about the Profile API [Profile API | Elasticsearch Guide [8.2] | Elastic]. (There is also dedicated documentation available for the Search Profiler UI [Profile queries and aggregations | Kibana Guide [8.2] | Elastic].)

As explained before, the response to a profiling request is usually quite complex, even for simple queries. As mentioned under Limitations [Profile API | Elasticsearch Guide [8.2] | Elastic], the reported timings do not include all aspects of the request. And in the section about the Timing Breakdown [Profile API | Elasticsearch Guide [8.2] | Elastic] you can find the following statement: "The intention of the breakdown is to give you a feel for A) what machinery in Elasticsearch is actually eating time, and B) the magnitude of differences in times between the various components. Like the overall time, the breakdown is inclusive of all children times." This is in line with the response I gave you during the class.
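The "inclusive of all children times" point can be illustrated with a small sketch. The numbers and node structure below are made up, but they mirror the shape of a profile response: each query node's `time_in_nanos` already contains the time of its child clauses, so naively adding a parent to its children double-counts.

```python
# Hypothetical, simplified fragment of a profile response.
# The BooleanQuery's time_in_nanos already INCLUDES its children's times.
profile_node = {
    "type": "BooleanQuery",
    "time_in_nanos": 119_227_000,  # 119.227 ms (made-up number)
    "children": [
        {"type": "TermQuery", "time_in_nanos": 40_000_000, "children": []},
        {"type": "TermQuery", "time_in_nanos": 25_000_000, "children": []},
    ],
}

def self_time(node):
    """Time spent in this node alone, excluding its children's inclusive times."""
    return node["time_in_nanos"] - sum(c["time_in_nanos"] for c in node["children"])

# The parent's own work is its reported time minus the children's times.
print(self_time(profile_node) / 1e6, "ms")  # 54.227 ms of 'own' work
```

So a parent figure larger than the sum of the visible children is expected: the difference is the parent's own work, which the breakdown does not list as a separate row.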

It will be quite hard, even for Lucene experts, to completely understand every single number reported in the response. And my guess is that the aggregated numbers do not sum up due to the distributed nature of execution, possibly double-counting timings of work that happens in parallel. Last but not least, even if you were able to track down every single number in a particular response, that could change again from one minor version to the next.
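To make the parallel-execution point concrete, here is a hypothetical sketch (the per-shard numbers are invented): if timings from shards that ran concurrently are summed, the cumulative figure can far exceed the wall-clock time of the request.

```python
# Made-up per-shard query times (ms) for a request that hits 3 shards in parallel.
shard_times_ms = [50.0, 40.0, 29.227]

# Summing per-shard times yields a large cumulative figure ...
cumulative_ms = sum(shard_times_ms)   # 119.227 ms

# ... while the request's wall-clock time is closer to the slowest shard.
wall_clock_ms = max(shard_times_ms)   # 50.0 ms

print(f"cumulative: {cumulative_ms:.3f} ms, wall clock: ~{wall_clock_ms:.3f} ms")
```

Neither figure is "wrong"; they simply measure different things, which is one plausible reason the numbers in the UI do not add up to each other.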

Elasticsearch may look simple because it hides a lot of its complexity. But under the hood, it is still a complex, distributed system, leveraging Lucene, which alone carries a lot of complexity. The Profile API is probably the one Elasticsearch API that gets you closest to that complexity, down to a very low-level, local Lucene view. Therefore it is not (and probably never will be) trivial to fully understand the individual numbers in the response and how those numbers sum up.