Elasticsearch 6.4 filtered aggregation seemingly not running a pre_filter

Kirill_Afan · October 9, 2018, 8:43pm

Good afternoon

We are struggling to debug an aggregation we are running.
Index mapping:

"tags_keyword": {
  "type": "keyword"
}

where tags_keyword is an array of lowercased keyword terms

query:
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "doc_type": "product" } },
        { "term": { "shop_id": 1 } }
      ]
    }
  },
  "aggs": {
    "tags": {
      "terms": {
        "field": "tags_keyword"
      }
    }
  }
}

We are trying to get the top 10 tags for a particular filtered query. We use the filter clause to reduce the data set and the aggs clause to compute the aggregation. Additionally, routing is attached to this request, which forces the query to run against a single shard.

The weird part: if we run just the query clause with no aggs, we get responses in about 20ms. If we add the aggs clause, responses take 16-20s. It's as if the aggregation runs against the entire shard's data (13 million records) and is filtered afterwards, instead of first filtering down to the result set (816 records) and then aggregating over it.

We also tried a filtered aggregation: removing the query clause entirely and putting the filter inside aggs.filter instead. Same result.

Like so:

GET /products/_search?routing=1
{
  "size": 1,
  "profile": true,
  "aggs" : {
    "t_shirts" : {
      "filter" : { "term": { "shop_id": 1 } },
      "aggs": {
        "tags": {
          "terms": {
            "field": "tags_keyword"
          }
        }
      }
    }
  }
}

Here is the very curious part: removing the routing from the URL and profiling the request in Kibana shows a maximum per-shard response time of 1.7s, not 16-20s. Only when the runtimes of all the shards are added together do we get to 16-20s.

I could literally write a loop to aggregate the tags of 816 records that would run faster than 20s. It looks like it is not pre-filtering. Is this an ES bug?
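
For comparison, the client-side loop I have in mind would look roughly like the sketch below. It assumes the 816 matching documents have already been fetched as plain dicts; the function name top_tags and the sample documents are made up for illustration. Each tag is counted once per document, which is how I understand doc_count to work, and ties are broken alphabetically just to keep the output stable.

```python
from collections import Counter


def top_tags(docs, size=10):
    """Count how many documents carry each tag and return the top `size`."""
    counts = Counter()
    for doc in docs:
        # set() so a tag repeated inside one document is counted only once
        counts.update(set(doc.get("tags_keyword", [])))
    # Highest count first; ties ordered alphabetically for a stable result
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


# Hypothetical sample standing in for the filtered result set
sample_docs = [
    {"tags_keyword": ["red", "cotton"]},
    {"tags_keyword": ["red"]},
    {"tags_keyword": []},
    {},  # document without the field
]
print(top_tags(sample_docs))
```

Over a few hundred documents this is a single linear pass, which is why the 16-20s looks so out of proportion to me.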

What is going on here?

system · November 6, 2018, 8:43pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
