ANN Search is super slow

Hello there,

I have a question regarding Elasticsearch vector search. Our vector field has 768 dimensions, and the data set contains 25,000,000 documents split into two indices. Here are the results from /_cat/indices:

health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   kr-documents  8bCjsGAxQfC8CPPwnTKBCg  32   0    9587847       698158    182.8gb        182.8gb
green  open   us-documents  Es86NloYSVK7xMN7XyzPZg  32   0   17183616      3021606    680.7gb        680.7gb

Here is the vector field mapping:

"vector": {
    "type": "dense_vector",
    "dims": 768,
    "index": true,
    "similarity": "cosine"
}

I've executed the Python code below for vector search, but even after waiting for more than a minute, the search results are not returned and I receive a ConnectionTimeout error. I need to retry 6-8 times to get the search results.

from elastic_transport import ConnectionTimeout  # exception raised by the 8.x Python client on timeouts

# Retry the kNN search, because it regularly times out.
for retry_count in range(1, 11):
    try:
        es_resp: dict = ES_MODULE.search(
            index=target_index,
            query=query,
            knn={
                "field": "vector",
                "query_vector": embedding,
                "k": kwargs.get("k", 200),  # It should be larger than 200
                "num_candidates": 200,      # must be >= k
                "similarity": 0.8,
            },
            size=size,
            source=["patent_number", "country", "vector"],
        )
    except ConnectionTimeout:
        LOGGER.warning(msg={"message": f"Connection timed out (TRIED {retry_count} / 10)"})
    else:
        break

I will also provide additional information related to the nodes. Each node has 32GB of RAM.

  • Server 1
    • master
    • data_content
    • data_content
    • coordinating_only
  • Server 2
    • data_content

The reason I mention that each node has 32GB of RAM is that, according to the formula below from the documentation on tuning for kNN performance, the required RAM capacity is about 72GB, and there are three data_content nodes. Therefore, I thought that each node would need about 24GB.

num_vectors * 4 * (num_dimensions + 12)
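
To show where the 72GB figure comes from, here is the quick arithmetic behind that formula (just a sanity check using the document count and dimensions above):

# Estimate of off-heap RAM needed for ~25M float32 vectors of 768 dims.
num_vectors = 25_000_000
num_dimensions = 768

required_bytes = num_vectors * 4 * (num_dimensions + 12)
print(required_bytes / 1024**3)  # ~72.6 GiB, i.e. the ~72GB mentioned above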

If you could assist me with this issue, it would be greatly appreciated.

Oh, and each node is running in a Docker container.

Which version is it?

It's a very basic thing, but I forgot to include it. My apologies.
It's 8.8.1

I have been doing something similar lately. I faced the same issues you are facing and found a solution by following the documentation on tuning for kNN performance.

A quick summary of what I am doing:

  • Roughly 90,000,000 docs,
  • Each doc has a dense vector of 768 dimensions,
  • 5 nodes (32 vCPU, 128GB RAM each)

A couple of things that really made a difference:

  • Make sure you have sufficient RAM (Analyze index disk usage API | Elasticsearch Guide [8.10] | Elastic), and leave some spare RAM for other processes. Also, by default Elasticsearch uses 50% of the node's RAM for heap, I believe, so my guess is that your 24GB of vector data per node exceeds what is left for the file system cache.
  • Use force merge to reduce the number of segments (I used 2 per shard). Merging segments really helped query speed! There are downsides to having very few segments, though, so maybe finding some balance would be good here (see the sketch after this list).
  • Preload the kNN index into memory (Preloading data into the file system cache | Elasticsearch Guide [8.9] | Elastic). This really helped a lot as well! But make sure that when you do this, the index can fit into your RAM.
  • When you have sufficient RAM and have set up preloading of the kNN index into memory, restart the cluster and rerun your experiment. Check the IO on your cluster while doing so: when the kNN index does NOT fit into memory, you will see high read IO.
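
For anyone following along, here is a minimal sketch of the force merge and preload steps with the Python client. The cluster URL is an assumption, the index name is taken from earlier in this thread, and the preloaded file extensions follow the preloading docs, so check them against your version:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: adjust to your cluster

# Force merge down to a small number of segments per shard (can take a long time on large indices).
es.indices.forcemerge(index="us-documents", max_num_segments=2)

# index.store.preload is a static setting, so the index must be closed to change it.
es.indices.close(index="us-documents")
es.indices.put_settings(
    index="us-documents",
    settings={
        # "vex" = HNSW graph, "vec" = raw vector values (per the preloading docs);
        # other extensions may apply depending on your version and quantization.
        "index.store.preload": ["vex", "vec"]
    },
)
es.indices.open(index="us-documents")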

There are a couple of challenges that lie ahead when you do these things, I noticed:

  • (Heavy) indexing into this index will increase the number of segments again, making your queries slow again! I have not found a proper way to deal with this.
  • Heavy operations on the cluster, like expensive queries or heavy indexing (even in another index), may push the kNN index out of memory, making it terribly slow again! Yesterday I started a thread on this: Manage heavy indexing in kNN indexes

Hope this helps a bit!

7 Likes

As there are a lot of performance improvements, could you upgrade to 8.10.3 and test that again?

@dadoonet Thank you for your reply. I tried it, but it's still slow.

However, increasing the RAM capacity to 64GB significantly alleviated the symptoms. It seems like @Thijsvdp's hypothesis was correct! I will give it a try.
Appreciate that!

We are on 8.9.1. Has there been a significant improvement in 8.10.3 vs 8.9.1 in terms of vector search?

What is the number of shards and the number of segments currently in your setup?

I specified 32 primary shards per index, and I didn't configure segments separately. I just checked, and we have 657 segments in "us-documents" and 73 segments in "kr-documents."
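
For reference, this is how I checked those counts with the Python client (a quick sketch assuming the client instance and index names from earlier in the thread):

# Show shard allocation and per-shard segment counts for the two indices.
print(es.cat.shards(index="us-documents,kr-documents", v=True))
print(es.cat.segments(index="us-documents,kr-documents", v=True))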

I saw that in the release notes: What’s new in 8.10 | Elasticsearch Guide [8.10] | Elastic

1 Like

Alright, thanks. So you can still try to preload the kNN index; that may help a lot, I think. And if you want to speed things up further, you may decrease the number of segments, but this can have negative effects as described above.

That said, I also noticed you use cosine similarity. I would recommend using dot product instead: if you make sure you index normalized vectors, the two are equivalent, and you then avoid doing the normalization on each search request over and over again.
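
Here is a minimal sketch of what that looks like; the mapping mirrors the one at the top of the thread with only the similarity changed, and the normalization helper is my own (assumes numpy):

import numpy as np

# Same field mapping as above, but with dot_product instead of cosine.
# dot_product requires that every indexed vector (and query vector) has unit length.
mapping = {
    "properties": {
        "vector": {
            "type": "dense_vector",
            "dims": 768,
            "index": True,
            "similarity": "dot_product",
        }
    }
}

def normalize(embedding):
    """Scale a vector to unit length so dot_product gives the same ranking as cosine."""
    v = np.asarray(embedding, dtype=np.float32)
    return (v / np.linalg.norm(v)).tolist()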

Oh that is great! I was actually waiting for this feature, but must have missed it. Thanks a lot!

It seems like you're encountering ConnectionTimeout errors in your Elasticsearch vector search. Given your data size and setup, optimizing the Elasticsearch cluster for improved performance, possibly increasing the RAM per node, and tweaking the timeout settings might help resolve this issue.

Setting up preloading as you suggested was incredibly helpful. A query that used to take 2-3 minutes now executes in under 1 second. You saved my ass. Thank you!

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.