Performance impact of the "include/exclude" fields of an aggregation

deviantony · December 11, 2015, 12:38pm

Using the following entities:

{
  "label": "Galaxy S4",
  "categoryPath": ["Smartphone/Android/5.1"]
}

{
  "label": "Galaxy S6",
  "categoryPath": ["Smartphone/Android/6.0"]
}

{
  "label": "Iphone 6s",
  "categoryPath": ["Smartphone/IOS"]
}

And the category tree for this example:

| /
| / Smartphone
| / Smartphone / Android
| / Smartphone / Android / 5.1
| / Smartphone / Android / 6.0
| / Smartphone / IOS

What I would like to do is retrieve the number of products per category level, e.g. how many products are located in the "Smartphone" category? I expect it to return two buckets, one for each direct child category only (Android and IOS).

I'm currently using the following query to count the products in the "Smartphone" category:

GET my_index/product/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "categoryPath.tokenized": "/Smartphone"
        }
      }
    }
  },
  "aggs": {
    "category": {
      "terms": {
        "field": "categoryPath.tokenized",
        "size": 0,
        "include": "\/Smartphone\/.*",
        "exclude": "\/Smartphone\/.*\/.*"
      }
    }
  }
}
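To illustrate what the include/exclude pair is meant to select, here is a minimal Python sketch of the filtering step. It is an approximation only, not how Elasticsearch evaluates the aggregation: it assumes each document's tokens carry a leading slash (as the query and category tree suggest), and it uses `re.fullmatch` to stand in for Lucene's implicitly anchored regular expressions. The names `DOC_TOKENS` and `child_buckets` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical token lists, one per product, as a path_hierarchy
# tokenizer would emit them for paths stored with a leading slash.
DOC_TOKENS = [
    ["/Smartphone", "/Smartphone/Android", "/Smartphone/Android/5.1"],  # Galaxy S4
    ["/Smartphone", "/Smartphone/Android", "/Smartphone/Android/6.0"],  # Galaxy S6
    ["/Smartphone", "/Smartphone/IOS"],                                 # Iphone 6s
]

def child_buckets(doc_tokens, parent):
    """Count documents per direct child of `parent`."""
    include = re.compile(re.escape(parent) + "/.*")     # any descendant
    exclude = re.compile(re.escape(parent) + "/.*/.*")  # grandchildren and deeper
    counts = Counter()
    for tokens in doc_tokens:
        if parent not in tokens:  # mimics the term filter on the parent
            continue
        for token in set(tokens):  # each document counts once per term
            if include.fullmatch(token) and not exclude.fullmatch(token):
                counts[token] += 1
    return dict(counts)

print(child_buckets(DOC_TOKENS, "/Smartphone"))
```

Under these assumptions, only the tokens exactly one level below "/Smartphone" survive both patterns, which is the two-bucket result described above.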

Mapping used:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "path_analyzer": {
          "tokenizer": "path_hierarchy"
        }
      }
    }
  },
  "mappings": {
    "product": {
      "properties": {
        "label": {
          "type": "string",
          "analyzer": "english"
        },
        "categoryPath": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true,
          "fields": {
            "tokenized": {
              "type": "string",
              "analyzer": "path_analyzer"
            }
          }
        }
      }
    }
  }
}
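For readers unfamiliar with the `path_hierarchy` tokenizer used by the `tokenized` sub-field, the following Python sketch approximates its default behaviour (delimiter `/`, no skip or reverse options). The function name `path_hierarchy` is hypothetical and the code is only an emulation for illustration, not the Lucene implementation.

```python
def path_hierarchy(path, delimiter="/"):
    """Return every prefix of `path` that ends at a delimiter boundary."""
    parts = path.split(delimiter)
    tokens = []
    for i in range(1, len(parts) + 1):
        token = delimiter.join(parts[:i])
        if token:  # skip the empty prefix produced by a leading delimiter
            tokens.append(token)
    return tokens

print(path_hierarchy("/Smartphone/Android/5.1"))
print(path_hierarchy("Smartphone/IOS"))
```

Under this sketch, a path stored without a leading slash yields tokens without one, so whether a term such as "/Smartphone" matches depends on how the `categoryPath` values are actually written at index time.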

This is based on the topic "Aggregation on a materialized path".
