Terms aggregation slower than scripted metric

Roman_Margolis · December 22, 2015, 8:04pm

Hi.

I'm running multi-layered aggregation queries on an Elasticsearch 1.5.2 cluster. The first layer is a filters aggregation with dozens, and sometimes hundreds, of filters; the second layer is a simple terms aggregation on a string field. For example:

{
   "query": {
      "filtered": {
         "filter": <some filter>
      }
   },
   "aggs": {
      "myFilters": {
         "filters": {
            "filters": {
               "f1": <filter_1>,
               "f2": <filter_2>,
                ....
               "fn": <filter_n>
            }
         },
         "aggs": {
            "myTerms": {
               "terms": {
                  "field": "myField",
                  "size": 1000
               }
            }
         }
      }
   }
}

This works fine.

However, I noticed that if I replace the terms aggregation with a scripted metric aggregation that does almost the same thing (term counts, minus the ordering), the query runs significantly faster: twice as fast, sometimes three times as fast. In absolute terms, the latency difference between the two queries can reach a couple of dozen seconds on a few indices that hold a few billion documents.

This surprised me somewhat, since I know Elasticsearch uses global ordinals for string fields to improve terms aggregation performance dramatically (compared with unbounded numeric values, for example, which I have witnessed first-hand in other scenarios). The scripted metric, on the other hand, does not use global ordinals at all and simply uses a Map to keep per-term counts.
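
To make the comparison concrete, here is a minimal sketch of a Map-based scripted metric of the kind described above. It is illustrative only and not necessarily the exact script used here. It assumes Groovy scripting is enabled on the 1.x cluster, that every document has a single value for `myField`, and the aggregation name `myTermCounts` is made up for the example. This block would take the place of the `myTerms` sub-aggregation under each filter bucket:

```json
{
   "aggs": {
      "myTermCounts": {
         "scripted_metric": {
            "init_script": "_agg['counts'] = [:]",
            "map_script": "t = doc['myField'].value; _agg.counts[t] = (_agg.counts[t] ?: 0) + 1",
            "combine_script": "return _agg.counts",
            "reduce_script": "merged = [:]; for (c in _aggs) { for (e in c) { merged[e.key] = (merged[e.key] ?: 0) + e.value } }; return merged"
         }
      }
   }
}
```

Each shard builds its own map in `map_script`, hands it back in `combine_script`, and the coordinating node merges the per-shard maps in `reduce_script`. Unlike the terms aggregation, nothing here sorts the terms or trims the result to a `size`, so the two queries are not strictly equivalent.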

Can anyone explain if this behavior is to be expected, and if it is, shed some light on why this could be happening?

Thanks
