Elasticsearch 8.17 → 8.19 upgrade: kNN now eagerly defaults "k = size", breaking aggregation-only queries

shekhar_k · January 19, 2026, 4:50pm

I’m upgrading Elasticsearch from 8.17.3 to 8.19.10 and ran into a behavioural change with kNN + aggregations that breaks an existing use case.

What worked in 8.17.3:

We use knn inside the query DSL (bool.must) together with size: 0, because we don’t need hits, only the aggregations.

In 8.17.3, omitting k allowed aggregations to effectively run over all kNN candidates (num_candidates) across shards, as long as they passed a similarity / min_score threshold.
This let us:

- keep response size small (size: 0)
- run large nested aggregations
- avoid artificially bounding results to top-k

What breaks in 8.19.10:
In 8.19.10, the same query returns empty aggregation buckets.
After investigation, it appears that:

- k is now eagerly defaulted to size
- with size: 0, this effectively becomes k = 0
- aggregations then run over zero kNN hits

Setting an explicit k fixes the empty buckets, but it introduces a hard top-k bound (e.g. k ≤ 10k), which changes the semantics for us:

- our previous queries aggregated over all candidates above a similarity threshold
- now they are strictly bounded to top-k neighbours

Our use case is aggregation-heavy (nested + reverse_nested) and the kNN stage is only meant to define the candidate set, not to limit results to top-k.
In practice, the number of documents above the similarity threshold can vary from 1k to >1M, and we need aggregations to reflect that set.

Question

Is this behaviour change intentional (possibly related to “eager defaulting of k”)?
Is there a supported way in 8.19+ to:
- keep size: 0
- avoid hard top-k truncation
- and still aggregate over the full kNN candidate set (num_candidates)?

Sample Snippet

{
  "query": {
    "function_score": {
      "boost_mode": "replace",
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "_score / params.total_boost",
              "params": {
                "total_boost": 1
              }
            }
          }
        }
      ],
      "min_score": 0.5,
      "query": {
        "bool": {
          "filter": [],
          "must": [
            {
              "knn": {
                "field": "embeddings_768_bgebase",
                "query_vector": [],
                "num_candidates": 3500,
                "boost": 1,
                "similarity": 0
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "total_hits_bucket": {
      "filter": {
        "match_all": {}
      },
      "aggs": {
        "score_filters": {
          "range": {
            "ranges": [
              {
                "from": 0.6
              }
            ],
            "script": {
              "source": "_score"
            }
          }
        }
      }
    }
  },
  "from": 0,
  "size": 0
}

Carlos_D · January 28, 2026, 6:18pm

Hi @shekhar_k, sorry for the late reply.

"our previous queries aggregated over all candidates above a similarity threshold"

To clarify, that is not what happened before. Aggregations are computed over the top k documents from each shard; previously, when k was not specified, it simply defaulted to num_candidates.

According to the docs:

"knn query calculates aggregations on top k documents from each shard. Thus, the final results from aggregations contain k * number_of_shards documents."

Now that k defaults to size, you should be able to restore the previous behavior by explicitly setting k to the same value as num_candidates.
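
As a minimal sketch of that change, assuming the knn clause from the sample snippet above stays otherwise untouched, the only addition is an explicit k equal to num_candidates (3500 in the snippet). The query_vector is left empty here exactly as it was in the original post, so a real vector has to be supplied before this runs:

```json
{
  "knn": {
    "field": "embeddings_768_bgebase",
    "query_vector": [],
    "k": 3500,
    "num_candidates": 3500,
    "boost": 1,
    "similarity": 0
  }
}
```

This clause would replace the existing knn entry inside bool.must, with size: 0 and the aggs section left as they are. Per the docs quote above, aggregations should then cover up to k documents from each shard rather than zero.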

Hopefully that makes sense, and you can get back to your previous behavior!
