The main takeaway for me was to use the `"index.refresh_interval": "-1"` setting and to run a first request with `"_source": false` to get to acceptable performance. Thanks again @mayya @Julie_Tibshirani
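For anyone following along, those two settings look roughly like this (the index name, field name, and kNN parameters are placeholders; the `knn` section assumes the Elasticsearch 8.x `_search` API):

```
PUT my-vector-index/_settings
{
  "index": { "refresh_interval": "-1" }
}

GET my-vector-index/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector": [0.12, -0.45, 0.33],
    "k": 10,
    "num_candidates": 100
  },
  "_source": false
}
```

Disabling the refresh interval avoids refresh work competing with searches during bulk indexing, and `"_source": false` skips fetching the (large) stored vectors for each hit.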
We added another index, Index_d, with more than 105 million documents, each with a 768-dimensional vector. This index might grow to over 200 million documents.
So in total there are:
Index a_cos: ~6 million
Index b_cos: ~10 million
Index c_cos: ~3 million
Index_d_cos: ~105 million
It currently takes 58 minutes to run an ANN search, which is far too long for our use case.
My hypothesis is that the index does not fit in RAM.
The setup is in a cloud environment where we currently have:
1 node
8 vCPUs
128 GB of RAM
4 TB of SSD storage
So now my questions are:
What would be a better setup to reach acceptable performance (requirement: ~1 s per query)?
Cluster size?
Node size?
vCPUs?
RAM?
SSD storage?
Are there any additional performance tweaks?
58 minutes seems like an extremely long time for an ANN search. Are you sure the search is not blocked by ongoing indexing? Are you running these searches after all indexing is done and the index has been refreshed?
For the fastest searches we recommend having enough RAM for all vectors to fit in memory. For example, if you have 200M vectors of 768 dims, with each dim a float taking 4 bytes, a comfortable RAM size is at least 4 * 768 * 200M ≈ 614 GB, plus overhead for the HNSW graph on top of that. That's really a lot. There are several ways to address it:
distribute vector search across several machines
reduce the number of dims (768 is a lot of dims; is there a way to reduce them?)
quantize vector values to lower precision (e.g. 8 bits instead of 32 bits). This is still a work in progress on the Lucene side and currently not supported in Elasticsearch, but we aspire to support it.
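As a sanity check, the back-of-the-envelope math in the reply above can be written out. Note that this counts only the raw float32 vector data; the HNSW graph structure adds further memory overhead that is not modeled here:

```python
# Rough RAM needed to keep raw float32 vectors fully in memory.
# Counts only the vector values themselves; the HNSW graph adds extra overhead.
def vector_ram_bytes(num_docs: int, dims: int, bytes_per_value: int = 4) -> int:
    return num_docs * dims * bytes_per_value

total = vector_ram_bytes(num_docs=200_000_000, dims=768)
print(f"{total / 1e9:.1f} GB")  # ~614.4 GB for 200M x 768 float32 vectors
```

Dropping from 32-bit floats to 8-bit quantized values (`bytes_per_value=1`) would cut this to roughly a quarter, which is why quantization is one of the suggested mitigations.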
Thanks for the feedback. We will be working on developing these guidelines.
For now, just keep in mind that for fast vector search we suggest having at least enough RAM to hold your vectors (4 bytes * number of dims * number of docs), and this RAM is outside of the Java heap.
Could you please elaborate a little bit more on what that means?
In my previous post, we found that search performs better when the Java heap is reduced. The machine has 128 GB of RAM, and we reduced the heap from the recommended half of RAM (64 GB) down to 24 GB via -Xms24g -Xmx24g.
That configuration worked better.
Am I right in assuming that the HNSW implementation could then use more RAM and run faster?
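For reference, the heap override described above can be pinned in a custom JVM options file (the file name is arbitrary; the `config/jvm.options.d/` directory is the standard location in recent Elasticsearch versions):

```
# config/jvm.options.d/heap.options
-Xms24g
-Xmx24g
```

Setting -Xms and -Xmx to the same value avoids heap resizing pauses, and keeping the heap small leaves the remaining RAM available to the OS page cache.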
I experimented with my setup to observe how RAM behaves with a reduced heap.
Using htop, I could not see any additional RAM being used by HNSW.
If it's not using the Java heap and I cannot detect any changes in htop, where is the structure stored?
Update: I noticed that htop showed the RAM as full with yellow (except for the green Elasticsearch part). Yellow refers to the disk cache. Am I right in assuming that this is the HNSW structure, which is stored on disk but is now cached in RAM?