As explained before the response to a profiling request is usually quite complex even for simple queries. As mentioned under Limitations [Profile API | Elasticsearch Guide [8.2] | Elastic] the reported timings do not include all aspects of the request. And in the section about the Timing Breakdown [Profile API | Elasticsearch Guide [8.2] | Elastic] you can find the following statement: "The intention of the breakdown is to give you a feel for A) what machinery in Elasticsearch is actually eating time, and B) the magnitude of differences in times between the various components. Like the overall time, the breakdown is inclusive of all children times.". This is in line with the response I gave you during the class.
It will be quite hard, even for Lucene experts to completely understand every single number reported back in the response. And it's my guess that the aggregated numbers do not sum up, due to the distributed nature of execution, maybe double-counting timings that happen in parallel. Last but not least, even if you would be able to track down every single number in a particular response, that could again change from one minor version to the next minor version.
Elasticsearch may look simple as it's hiding a lot of its complexity. But under the hood, it's still a complex, distributed system, leveraging Lucene, which alone bears a lot of complexity. The search profile API is probably the single API of Elasticsearch that gets you closest to that complexity down to a very low local Lucene level. Therefore it is (and probably never will be) trivial to fully understand the individual numbers in the response and how those numbers sum up.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.