Unable to get more than the top 4 result matches, regardless of configuration

Hello everyone,

I am using ES Enterprise version 8.18.1 (I cannot change it for the time being, as it is managed by another department). I created an Elasticsearch vector store, used as a vector database, with the following mappings (currently a dummy index with 1,000 records in it):

{
  "mappings": {
    "dynamic": "true",
    "dynamic_templates": [
      {
        "all_text_fields": {
          "match_mapping_type": "string",
          "mapping": {
            "analyzer": "iq_text_base",
            "fields": {
              "delimiter": {
                "analyzer": "iq_text_delimiter",
                "type": "text",
                "index_options": "freqs"
              },
              "joined": {
                "search_analyzer": "q_text_bigram",
                "analyzer": "i_text_bigram",
                "type": "text",
                "index_options": "freqs"
              },
              "prefix": {
                "search_analyzer": "q_prefix",
                "analyzer": "i_prefix",
                "type": "text",
                "index_options": "docs"
              },
              "enum": {
                "ignore_above": 2048,
                "type": "keyword"
              },
              "stem": {
                "analyzer": "iq_text_stem",
                "type": "text"
              }
            }
          }
        }
      }
    ],
    "properties": {
      "course_id": {
        "type": "text",
        "fields": {
          "delimiter": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "iq_text_delimiter"
          },
          "enum": {
            "type": "keyword",
            "ignore_above": 2048
          },
          "joined": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "i_text_bigram",
            "search_analyzer": "q_text_bigram"
          },
          "prefix": {
            "type": "text",
            "index_options": "docs",
            "analyzer": "i_prefix",
            "search_analyzer": "q_prefix"
          },
          "stem": {
            "type": "text",
            "analyzer": "iq_text_stem"
          }
        },
        "analyzer": "iq_text_base"
      },
      "course_name_backup": {
        "type": "text",
        "fields": {
          "delimiter": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "iq_text_delimiter"
          },
          "enum": {
            "type": "keyword",
            "ignore_above": 2048
          },
          "joined": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "i_text_bigram",
            "search_analyzer": "q_text_bigram"
          },
          "prefix": {
            "type": "text",
            "index_options": "docs",
            "analyzer": "i_prefix",
            "search_analyzer": "q_prefix"
          },
          "stem": {
            "type": "text",
            "analyzer": "iq_text_stem"
          }
        },
        "analyzer": "iq_text_base"
      },
      "embedding": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "int8_hnsw",
          "m": 16,
          "ef_construction": 100
        }
      },
      "id": {
        "type": "text",
        "fields": {
          "delimiter": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "iq_text_delimiter"
          },
          "enum": {
            "type": "keyword",
            "ignore_above": 2048
          },
          "joined": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "i_text_bigram",
            "search_analyzer": "q_text_bigram"
          },
          "prefix": {
            "type": "text",
            "index_options": "docs",
            "analyzer": "i_prefix",
            "search_analyzer": "q_prefix"
          },
          "stem": {
            "type": "text",
            "analyzer": "iq_text_stem"
          }
        },
        "analyzer": "iq_text_base"
      },
      "language": {
        "type": "text",
        "fields": {
          "delimiter": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "iq_text_delimiter"
          },
          "enum": {
            "type": "keyword",
            "ignore_above": 2048
          },
          "joined": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "i_text_bigram",
            "search_analyzer": "q_text_bigram"
          },
          "prefix": {
            "type": "text",
            "index_options": "docs",
            "analyzer": "i_prefix",
            "search_analyzer": "q_prefix"
          },
          "stem": {
            "type": "text",
            "analyzer": "iq_text_stem"
          }
        },
        "analyzer": "iq_text_base"
      },
      "main_text": {
        "type": "text",
        "fields": {
          "delimiter": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "iq_text_delimiter"
          },
          "enum": {
            "type": "keyword",
            "ignore_above": 2048
          },
          "joined": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "i_text_bigram",
            "search_analyzer": "q_text_bigram"
          },
          "prefix": {
            "type": "text",
            "index_options": "docs",
            "analyzer": "i_prefix",
            "search_analyzer": "q_prefix"
          },
          "stem": {
            "type": "text",
            "analyzer": "iq_text_stem"
          }
        },
        "analyzer": "iq_text_base"
      },
      "provider": {
        "type": "text",
        "fields": {
          "delimiter": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "iq_text_delimiter"
          },
          "enum": {
            "type": "keyword",
            "ignore_above": 2048
          },
          "joined": {
            "type": "text",
            "index_options": "freqs",
            "analyzer": "i_text_bigram",
            "search_analyzer": "q_text_bigram"
          },
          "prefix": {
            "type": "text",
            "index_options": "docs",
            "analyzer": "i_prefix",
            "search_analyzer": "q_prefix"
          },
          "stem": {
            "type": "text",
            "analyzer": "iq_text_stem"
          }
        },
        "analyzer": "iq_text_base"
      }
    }
  }
}
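For context, the index itself was created along these lines (a sketch, not the exact call I ran; `my_index_name` is a placeholder, and the custom `iq_*`/`i_*`/`q_*` analyzers referenced above must already be defined in the index settings, which are omitted here):

```python
# Sketch: create an index carrying the dense_vector mapping shown above.
# NOTE: the analyzer settings are omitted; without them, creating the full
# mapping would fail because it references custom analyzers.
mappings = {
    "properties": {
        "embedding": {
            "type": "dense_vector",
            "dims": 768,  # must match the embedding model's output size
            "index": True,
            "similarity": "cosine",
            "index_options": {"type": "int8_hnsw", "m": 16, "ef_construction": 100},
        },
        # ... the text fields from the full mapping above ...
    }
}
# es.indices.create(index="my_index_name", mappings=mappings)
```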

I also wrote the following dummy Python querying script:

import time

from google import genai
from google.genai.types import EmbedContentConfig
from elasticsearch import Elasticsearch


# MODEL = "text-embedding-005"
MODEL = "text-multilingual-embedding-002"
INDEX_NAME = "my_index_name"

client = genai.Client(
    vertexai=True,
    project="my-gcp-project",
    location="us-central1"
)


# ------------------------------------ UDFs ------------------------------------

def get_embeddings_single(input_text: str, model: str = MODEL):
    response = client.models.embed_content(
        model=model,
        contents=input_text,
        config=EmbedContentConfig(
            task_type="SEMANTIC_SIMILARITY",  # Optional
            output_dimensionality=768,  # Optional
        ),
    )
    return response.embeddings[0].values

# ---------------------- CONNECT TO ES VECTOR STORE ----------------------------

ELASTICSEARCH_VDBS_INFO = {
    "course_info_eng": {
        "url": "https://ccdcae5f73a84360909c59bc6cc8cc0a.us-central1.gcp.cloud.es.io:443",
        "key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=="
    },
    "course_info_spa": {
        "url": "https://ccdcae5f73a84360909c59bc6cc8cc0a.us-central1.gcp.cloud.es.io:443",
        "key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxy=="
    },
}

t1 = time.time()
es = Elasticsearch(
    ELASTICSEARCH_VDBS_INFO["course_info_eng"]["url"],
    api_key=ELASTICSEARCH_VDBS_INFO["course_info_eng"]["key"]
)
t2 = time.time()
print(
    "Connection to ES completed in {:.2f} seconds".format(t2 - t1)
)


# ----------------------- QUERY VECTOR DATABASE --------------------------------

query_text = "skill: Post Sound Techniques; job title: Management Consultant"
# providers = ["UVA LinkedIn Learning", "EDX"]  # DO NOT USE HERE
k = 10

t1 = time.time()

query_vector = get_embeddings_single(query_text)
search_query = {
    "knn": {
        "field": "embedding",
        "query_vector": query_vector,
        "k": 3,  # Number of nearest neighbors to return
        "num_candidates": 100,  # Number of candidates to consider
        "similarity": 0
    },
    "size": 50
}
response = es.search(index=INDEX_NAME, body=search_query)
t2 = time.time()

print("Queried {} elements in {:.2f} seconds".format(len(response), t2-t1))

As far as I understood from this other similar case [link], I should already be obtaining the top k=3 results, but regardless of what I do, I get exactly 4 (FOUR) results every time. My only suspicion is that something changed between ES versions 8.5.0 and 8.18.1 regarding kNN querying, but I am not sure exactly what. Note that I have already attempted the following:

  • Set "k": 3 ==> I get 4 results
  • Set "k": 10 ==> I get 4 results
  • Remove the "similarity": 0 line ==> I get 4 results

What is going on here?

Hello!
The k parameter doesn't limit the number of results. As the documentation says:

The number of nearest neighbors to return from each shard. Elasticsearch collects k results from each shard, then merges them to find the global top results.

Meaning that if you have more than one shard, which is common, you can get more than 3 neighbors!

What actually limits the result is the size parameter, so you can try setting that to 3 and it should work.

I'm not sure why you're not getting more than 4 results, though. Do you have exactly 4 documents in that index?
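For example, here is a minimal sketch of the request body I would try (field name taken from your script; the query vector here is a dummy placeholder):

```python
# Sketch of a kNN search body where "size" (not "k") caps the hits
# returned to the client. "k" neighbors are collected from each shard,
# then the per-shard results are merged.
def build_knn_body(query_vector, size=3, num_candidates=100):
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": size,              # neighbors gathered from each shard
            "num_candidates": num_candidates,
        },
        "size": size,               # hard cap on hits in the final response
    }

body = build_knn_body([0.0] * 768)  # dummy 768-dim vector
# es.search(index=INDEX_NAME, body=body) should now return at most 3 hits
```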

Hello, and thanks for replying.

I have been a bit busy today, but after troubleshooting this problem a bit more carefully, I realized that what you mention is exactly right. In fact, the search_query that worked in the end was:

# The rest of the code...

k = 10

query_vector = get_embeddings_single(query_text)
search_query = {
    "size": k,
    "query": {
        "bool": {
            "must": [
                {
                    "knn": {
                        "field": "embedding",
                        "query_vector": query_vector,
                        "k": k * 10,  # At least this many to make it functional
                        "num_candidates": k * 10  # To keep it as quick as possible
                    }
                }
            ],
            "filter": []  # Whatever filters you may want to include here
        }
    }
}
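As a side note, I now suspect my original `len(response)` print was misleading: a search response is a dict-like object with four top-level keys, so `len(response)` is always 4 no matter how many documents matched. A sketch of how the count should actually be read (the sample response shape here is abridged):

```python
# An Elasticsearch search response has four top-level keys
# ("took", "timed_out", "_shards", "hits"), so len(response) == 4
# regardless of the number of matches. The matched documents live
# under response["hits"]["hits"].
def count_hits(response) -> int:
    return len(response["hits"]["hits"])

sample_response = {  # abridged response shape, with two fake hits
    "took": 5,
    "timed_out": False,
    "_shards": {"total": 2, "successful": 2, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 0.9,
        "hits": [{"_id": "a", "_score": 0.9}, {"_id": "b", "_score": 0.8}],
    },
}
print(len(sample_response), count_hits(sample_response))  # 4 vs. 2
```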

Even though it now works, I still have a couple of questions:

  1. I have noticed that when I try to obtain a top k equal to roughly 10% of my data (e.g., k=100 search results when the whole dataset has, say, 1,000 samples), I only get 77 or so results. How can I force the search to return the full 100 results, even if the bottom ones are not great matches?
  2. For a vector database of ~330K samples, I am getting response times of 2-3 seconds. Is this behavior normal? Can it be sped up?

This was originally posted in the language-clients category, but these last questions are more about vector search performance, so I'm moving it to vector-search.