How does Elasticsearch use Lucene for aggregations?

wilqor · September 21, 2017, 10:59am

Hello,

I could not find this information in either the Elasticsearch Definitive Guide or the Elasticsearch Reference, so I am asking the community.

I would like to learn how Elasticsearch performs aggregations, particularly calculating the top N most frequently occurring terms, given that its segments are Lucene indexes. Does it make use of the Lucene Facets API when creating Lucene documents? What Lucene queries does Elasticsearch perform when calculating aggregations?

I have a local Elasticsearch node running with a debugger attached from IntelliJ, so any hints on where to look in the source code would be helpful. Any kind of explanation is also welcome.

Ivan · September 21, 2017, 3:47pm

Lucene has had join queries since version 4.x [1], and Elasticsearch
abstracts the details of using those queries behind its various joining
queries, such as Nested and Parent/Child.

Aggregations are done using custom code within Elasticsearch and do not
use the Facets API. I believe that Solr, like Elasticsearch, also does not
use the Facets API, but my knowledge of Solr is several versions old. I
have not explored the aggregations code in detail, but I am assuming that
Elasticsearch leverages the Lucene doc_values/fielddata APIs and does not
use queries to calculate the aggregations.
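
To make the doc_values idea concrete, here is a rough Python sketch of how top N terms could be counted from a per-segment, column-oriented structure instead of through extra queries. It is a toy model under stated assumptions, not Elasticsearch's or Lucene's actual code: the names `Segment`, `count_segment`, and `top_terms` are hypothetical, and the set of matching document ids is assumed to come from whatever query was already executed.

```python
import heapq
from collections import Counter


class Segment:
    """Toy stand-in for one segment's doc-values column for a single field."""

    def __init__(self, docs):
        # docs[i] is the list of terms stored for segment-local doc id i.
        # Sorted, de-duplicated term dictionary: the index of a term is its ordinal.
        self.terms = sorted({t for d in docs for t in d})
        ord_of = {t: i for i, t in enumerate(self.terms)}
        # Per-document sorted ordinals, i.e. the "column" read at aggregation time.
        self.doc_ords = [sorted({ord_of[t] for t in d}) for d in docs]


def count_segment(segment, matching_doc_ids):
    """Count term occurrences by ordinal for the documents that matched the query."""
    counts = [0] * len(segment.terms)
    for doc_id in matching_doc_ids:
        for o in segment.doc_ords[doc_id]:
            counts[o] += 1
    return counts


def top_terms(segments_with_matches, n):
    """Merge per-segment counts by term and return the n most frequent terms."""
    total = Counter()
    for segment, matching in segments_with_matches:
        for o, c in enumerate(count_segment(segment, matching)):
            if c:
                # Ordinals are only meaningful inside one segment, so this
                # sketch resolves them back to terms before merging.
                total[segment.terms[o]] += c
    # Highest count first; ties broken alphabetically by term.
    return heapq.nsmallest(n, total.items(), key=lambda kv: (-kv[1], kv[0]))


seg_a = Segment([["red", "blue"], ["red"], ["green"]])
seg_b = Segment([["blue"], ["red", "green"], []])
result = top_terms([(seg_a, [0, 1, 2]), (seg_b, [0, 1])], 2)
print(result)  # [('red', 3), ('blue', 2)]
```

The point of the sketch is that counting only needs a doc id to values lookup for the documents that already matched, which is the access pattern doc values are designed for, so no additional query per term is required.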

[1] https://lucene.apache.org/core/6_0_0/join/org/apache/lucene/search/join/package-summary.html

wilqor · September 22, 2017, 6:07am

Thanks Ivan!

I had not yet heard of the Lucene join queries, which is an interesting topic for me to investigate on its own.
From what I have recently found, no taxonomy directories are created for the Lucene indexes, which suggests that the Facets API is not used by Elasticsearch.

system · October 20, 2017, 6:07am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
