Filter based on the doc_count with aggregations

cbuescher · October 11, 2016, 12:43pm

Hi,

you can use the bucket_selector pipeline aggregation for this kind of filtering. In your case, try the following query:

GET /cars/transactions/_search
{
   "size": 0,
   "aggs": {
      "popular_colors": {
         "terms": {
            "field": "color"
         },
         "aggs": {
            "my_filter": {
               "bucket_selector": {
                  "buckets_path": {
                     "the_doc_count": "_count"
                  },
                  "script": "the_doc_count == 2"
               }
            }
         }
      }
   }
}

This should keep only the buckets with "doc_count" : 2 and drop all others. However, be aware that pipeline aggregations work on the outputs produced by other aggregations, so the overall amount of work needed to calculate the initial doc_counts stays the same. Since the script needs to be executed for each input bucket, the operation might be slow for high cardinality fields (as in thousands upon thousands of terms), but it should work well for relatively low cardinality fields (like colors, as in this case).
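
To make the "works on the outputs" point a bit more concrete, here is a rough Python sketch of what the selector does conceptually. It is not how Elasticsearch implements it. The bucket list, the colors, the counts, and the select_buckets helper are all made up for illustration. The idea is only that every bucket is computed first and the condition is then checked once per bucket:

```python
# Hypothetical output of the "popular_colors" terms aggregation.
# The keys and counts are invented for this illustration.
buckets = [
    {"key": "red", "doc_count": 4},
    {"key": "blue", "doc_count": 2},
    {"key": "green", "doc_count": 2},
]


def select_buckets(buckets, predicate):
    """Keep the buckets whose doc_count satisfies the predicate.

    The predicate runs once per already-computed bucket, which is why
    the cost grows with the number of terms.
    """
    return [b for b in buckets if predicate(b["doc_count"])]


# Same condition as the script in the query: the_doc_count == 2
selected = select_buckets(buckets, lambda the_doc_count: the_doc_count == 2)
print([b["key"] for b in selected])  # prints ['blue', 'green']
```

Note that all three buckets had to be built before the check could run, so the selector saves nothing on the counting side. It only trims what comes back in the response.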
