Retrieving top N hits from nested documents across all matching documents


I'm working with an Elasticsearch index in which each document has a nested field, `embedingContent`, holding the "chunks" of that document. Each chunk has its own vector embedding, and I want to run a vector similarity search across these chunks.

Here's a sample of the mapping for the `embedingContent` field:

"embedingContent" : {
    "type" : "nested",
    "properties" : {
        "content" : {
            "type" : "text",
            "fields" : {
                "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                }
            }
        },
        "contentTokens" : {
            "type" : "long"
        },
        "embeddedString" : {
            "type" : "text",
            "fields" : {
                "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                }
            }
        },
        "embedding" : {
            "type" : "dense_vector",
            "dims" : 1536
        },
        "id" : {
            "type" : "long"
        },
        "newsArticleId" : {
            "type" : "long"
        },
        "parentId" : {
            "type" : "long"
        },
        "splitId" : {
            "type" : "long"
        }
    }
}

I want to run a query that retrieves the 20 most relevant chunks across all documents, ranked by the cosine similarity of their embeddings to a query vector. The problem is that the `size` parameter in `inner_hits` limits the number of chunks returned per document, not the total number of chunks across all documents.

Here's the Elasticsearch query I'm currently using:

{
    "query": {
        "nested": {
            "path": "embedingContent",
            "query": {
                "script_score": {
                    "query": {"match_all": {}},
                    "script": {
                        "source": "cosineSimilarity(params.query_vector, 'embedingContent.embedding') + 1.0",
                        "params": {"query_vector": [0.1, 0.2, 0.3, ...]}  // Example query vector
                    }
                }
            },
            "inner_hits": {
                "size": 20
            }
        }
    },
    "_source": false
}

Does anyone know of a way to limit the total number of inner hits (chunks) returned across all documents? Any help would be greatly appreciated.
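
For reference, the only workaround I can think of is merging the inner hits on the client. Below is a minimal Python sketch of that idea, not a tested solution: it flattens the inner hits of every returned document, sorts them by `_score`, and keeps the best N. The helper name `top_chunks` and the sample response are made up for illustration, and the response shape assumes the usual `hits.hits[].inner_hits.<path>.hits.hits[]` layout.

```python
import heapq
from typing import Any


def top_chunks(response: dict[str, Any], path: str, n: int) -> list[dict[str, Any]]:
    """Flatten inner hits from every returned document and keep the n best by score.

    `top_chunks` is a hypothetical helper, not part of any Elasticsearch client.
    """
    chunks = []
    for doc in response["hits"]["hits"]:
        # Inner hits for a nested query are keyed by the nested path.
        inner = doc.get("inner_hits", {}).get(path, {}).get("hits", {}).get("hits", [])
        for hit in inner:
            chunks.append({
                "doc_id": doc["_id"],
                "offset": hit["_nested"]["offset"],  # position of the chunk in its document
                "score": hit["_score"],
            })
    # nlargest returns the n highest-scoring chunks in descending score order.
    return heapq.nlargest(n, chunks, key=lambda c: c["score"])


def _inner(scores: list[float]) -> dict[str, Any]:
    """Build a fake inner_hits section for the sample data below."""
    return {"embedingContent": {"hits": {"hits": [
        {"_nested": {"field": "embedingContent", "offset": i}, "_score": s}
        for i, s in enumerate(scores)
    ]}}}


# Made-up response with two documents; scores are illustrative only.
sample_response = {"hits": {"hits": [
    {"_id": "A", "_score": 1.55, "inner_hits": _inner([1.9, 1.2])},
    {"_id": "B", "_score": 1.43, "inner_hits": _inner([1.7, 1.5, 1.1])},
]}}

top = top_chunks(sample_response, "embedingContent", 3)
for chunk in top:
    print(chunk["doc_id"], chunk["offset"], chunk["score"])
```

This only sees chunks from documents the outer query actually returned, so at best it approximates a global top 20, which is why I'd prefer to do the limiting inside Elasticsearch.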


