Terms Aggregation vs Field Collapse for Search Suggestions at Scale (100K+ unique values)

I'm building a search suggestion feature and need help choosing between terms aggregation and field collapse for my use case.

My Dataset:

  • 3 million items in the index

  • 100,000+ unique product names (`product.brandName` in the mappings)

  • Users search by typing partial names (autocomplete)

  • I only need to return unique name strings (not full documents)

{
  "mappings": {
    "properties": {
      "completionField": {
        "type": "search_as_you_type",
        "max_shingle_size": 3
      },
      "product": {
        "type": "object",
        "properties": {
          "brandName": {
            "type": "text",
            "analyzer": "product_name_analyzer",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

Response Needed:

[
  { "text": "Panadol" },
  { "text": "Advil" },
  { "text": "Aspirin" }
]

Approach 1: Terms Aggregation

{
  "size": 0,
  "_source": false,
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "pana",
            "type": "bool_prefix",
            "fields": ["completionField", "completionField._2gram", "completionField._3gram"]
          }
        },
        {
          "multi_match": {
            "query": "pana",
            "fields": ["product.brandName^4"]
          }
        }
      ]
    }
  },
  "aggs": {
    "unique_brand_names": {
      "terms": {
        "field": "product.brandName.keyword",
        "size": 5,
        "order": { "max_score": "desc" }
      },
      "aggs": {
        "max_score": {
          "max": { "script": "_score" }
        }
      }
    }
  }
}
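
For context, here is a minimal sketch of how the response from this approach could be reduced to the shape under "Response Needed". It assumes the standard terms aggregation response layout (`aggregations.unique_brand_names.buckets[].key`). The helper name `suggestions_from_terms_agg` and the sample response are hypothetical, written by hand for illustration rather than taken from a real cluster.

```python
from typing import Any


def suggestions_from_terms_agg(
    response: dict[str, Any], agg_name: str = "unique_brand_names"
) -> list[dict[str, str]]:
    """Reduce a terms aggregation response to [{"text": ...}, ...]."""
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    # Buckets are returned in the order requested by the aggregation
    # (here: descending max_score), so the order is kept as-is.
    return [{"text": bucket["key"]} for bucket in buckets]


# Hypothetical response, trimmed to the parts that matter here.
sample_response = {
    "hits": {"total": {"value": 42, "relation": "eq"}, "hits": []},
    "aggregations": {
        "unique_brand_names": {
            "buckets": [
                {"key": "Panadol", "doc_count": 30, "max_score": {"value": 7.1}},
                {"key": "Panadol Extra", "doc_count": 12, "max_score": {"value": 5.4}},
            ]
        }
    },
}

print(suggestions_from_terms_agg(sample_response))
```

Only the bucket keys are read, so no document source needs to be fetched for this approach.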

Approach 2: Field Collapse

{
  "size": 5,
  "_source": ["product.brandName"],
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "pana",
            "type": "bool_prefix",
            "fields": ["completionField", "completionField._2gram", "completionField._3gram"]
          }
        },
        {
          "multi_match": {
            "query": "pana",
            "fields": ["product.brandName^4"]
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "product.brandName.keyword"
  },
  "sort": ["_score"]
}
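
A comparable sketch for this approach, assuming each collapsed hit carries the collapse key under `fields["product.brandName.keyword"]`, as collapse responses normally do. The helper name `suggestions_from_collapse` and the sample response are hypothetical and hand-written for illustration.

```python
from typing import Any

COLLAPSE_FIELD = "product.brandName.keyword"


def suggestions_from_collapse(
    response: dict[str, Any], field: str = COLLAPSE_FIELD
) -> list[dict[str, str]]:
    """Reduce a collapsed search response to [{"text": ...}, ...]."""
    suggestions = []
    for hit in response.get("hits", {}).get("hits", []):
        values = hit.get("fields", {}).get(field, [])
        # Skip hits that have no value for the collapse field.
        if values and values[0] is not None:
            suggestions.append({"text": values[0]})
    return suggestions


# Hypothetical response, trimmed to the parts that matter here.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_score": 7.1, "fields": {COLLAPSE_FIELD: ["Panadol"]}},
            {"_id": "9", "_score": 5.4, "fields": {COLLAPSE_FIELD: ["Panadol Extra"]}},
            {"_id": "5", "_score": 1.2, "fields": {COLLAPSE_FIELD: [None]}},
        ]
    }
}

print(suggestions_from_collapse(sample_response))
```

Here each suggestion costs one full hit per unique name, which is part of what I am trying to compare against the bucket-only response of Approach 1.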

Questions:

  1. For returning only unique string values (not documents), which approach is more efficient at this scale?
  2. Which uses less memory per query?
  3. Which provides more accurate relevance ordering?
  4. Are there alternative approaches I should consider?

Benchmark results so far:
Terms aggregation is ~2x faster than collapse on average.