Slow ANN Hybrid search

Hi,
I have implemented hybrid vector search using the ES dense_vector field and the kNN option in the search API.

I have two indices, each with a dense_vector field of 512-dimension embeddings (dot_product).

Each index has about 10 million documents.

I'm using ES v8.11.1 with 3 dedicated servers.
Each server has 32 GB RAM and 4 vCPUs.

A hybrid search query takes 10-30 seconds to return the first results.

An example query is shown below:

{
  "index": "catalogs",
  "body": {
    "knn": {
      "field": "vector_field1",
      "query_vector": [512 dimesion vector]
      "k": 20,
      "num_candidates": 100,
      "boost": 0.1,
      "filter": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "filter1": {
                    "query": "value1"
                  }
                }
              },
              {
                "bool": {
                  "must_not": [
                    {
                      "exists": {
                        "field": "filter1"
                      }
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must_not": {
              "exists": {
                "field": "filter2"
              }
            }
          }
        },
        {
          "term": {
            "filter3": true
          }
        },
        {
          "term": {
            "filter4": "value4"
          }
        }
      ]
    },
    "query": {
      "bool": {
        "boost": 0.9,
        "must": [
          {
            "multi_match": {
              "query": "text search query",
              "type": "best_fields",
              "analyzer": "standard",
              "fuzziness": "AUTO:5,8",
              "minimum_should_match": "-15%",
              "fields": [
                "filter5",
                "filter6"
              ]
            }
          }
        ],
        "filter": [
          {
            "bool": {
              "should": [
                {
                  "match": {
                    "filter1": {
                      "query": "value1"
                    }
                  }
                },
                {
                  "bool": {
                    "must_not": [
                      {
                        "exists": {
                          "field": "filter1"
                        }
                      }
                    ]
                  }
                }
              ]
            }
          },
          {
            "bool": {
              "must_not": {
                "exists": {
                  "field": "filter2"
                }
              }
            }
          },
          {
            "term": {
              "filter3": true
            }
          },
          {
            "term": {
              "filter4": "value4"
            }
          }
        ]
      }
    }
  },
  "size": 36,
  "from": 0,
  "_source": [
    "field1",
    "field2",
    "field3",
    "field4",
    "field5"
  ]
}
Cluster health output:

{
  "cluster_name": "search-prod",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 3,
  "number_of_data_nodes": 3,
  "active_primary_shards": 35,
  "active_shards": 71,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}
Index segments output (truncated):

{
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "indices": {
    "catalogs": {
      "shards": {
        "0": [
          {
            "routing": {
              "state": "STARTED",
              "primary": true,
              "node": "QS9RJgx0TJK82_inigACWg"
            },
            "num_committed_segments": 34,
            "num_search_segments": 35,

I know that I need to reduce the segment count to get better ANN search performance, but because my index is updated every 30 minutes with bulk requests, it is hard to keep the number of segments small.

Is there a way to optimize performance in my situation?
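
The only approach I know for reducing the segment count is a force merge after each bulk run (a sketch below, using my catalogs index; max_num_segments is the standard parameter of the _forcemerge API), but since force merge is expensive and my index is written to every 30 minutes, running this constantly seems impractical:

```
POST /catalogs/_forcemerge?max_num_segments=4
```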

Thanks for the help!

What is your search pattern? How often do you issue a search request? Are they issued after updates?
You said the first results take 10-30 seconds; what about follow-up searches, are they faster?
Do you have enough memory for your vectors? We have a kNN tuning guide that provides estimates of how much memory you need for good search performance.

Hi Mayya, thank you for the reply.

I run a search request about once every 5-10 minutes,
and bulk indexing can be running at the same time.

If it is the first search in 5 minutes, the response takes about 20-30 seconds;
if I search again right after that, it takes 5-10 seconds.

I have two indices with 512-dimension vectors (dot_product), and each index has about 10 million documents:

2 * 10,000,000 * 4 * (512 + 12) bytes = 41.92 GB

So is it correct that I need 41.92 GB of RAM for the file system cache, on top of the Java heap, on each node?
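
For reference, here is the arithmetic behind my estimate (a quick sketch; the formula num_vectors * 4 bytes * (dims + 12) is the float-vector rule of thumb from the kNN tuning guide):

```python
# Estimate off-heap RAM needed for HNSW float vectors:
# num_vectors * 4 bytes * (dims + 12), per the kNN tuning guide.
def knn_memory_gb(num_vectors: int, dims: int, num_indices: int = 1) -> float:
    bytes_needed = num_indices * num_vectors * 4 * (dims + 12)
    return bytes_needed / 1_000_000_000  # decimal GB

print(f"{knn_memory_gb(10_000_000, 512, num_indices=2):.2f} GB")  # → 41.92 GB
```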

Once again, thank you for the help!

2 * 10,000,000 * 4 * (512 + 12) bytes = 41.92 GB
So is it correct that I need 41.92 GB of RAM for the file system cache, on top of the Java heap, on each node?

So for a single shard you need half of this: around 21 GB of memory outside the Java heap just for vector search. By default Elasticsearch allocates half of the memory to the heap, so your node should have at least 42 GB of total memory to serve this shard. The other shard with 10M vectors should be hosted on another node.
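
Also, since your first search after an idle period is much slower than repeat searches, the vector files are likely being evicted from the file system cache in between. One thing that may help (a sketch; index.store.preload is a static index setting, so it can only be changed on a closed index, and the exact HNSW file extensions to preload should be verified against the docs for your version) is preloading the vector files into the cache:

```
POST /catalogs/_close

PUT /catalogs/_settings
{
  "index.store.preload": ["vec", "vex", "vem"]
}

POST /catalogs/_open
```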

If your node has only 32 GB of RAM, you either need to allocate less memory to the Java heap, or you need to break up your 10M-document shard into multiple shards and host them on other nodes.
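
For example, on a 32 GB node you could shrink the heap so that more RAM is left for the file system cache (a sketch; the file name under jvm.options.d is arbitrary, and the right heap size depends on your non-vector workload):

```
# /etc/elasticsearch/jvm.options.d/heap.options
-Xms8g
-Xmx8g
```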

We also provide a memory-optimized profile for storing vectors, available to AWS Elastic Cloud users!

Thank you for the help!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.