ANN Search: Performance / Setup

I recently wrote a post reporting some issues with ANN search and its setup: ANN Search Timeouts - #8 by Julie_Tibshirani

The main takeaway for me was to set "index.refresh_interval": "-1" and to run a first request with "_source": false to get acceptable performance. Thanks again @mayya @Julie_Tibshirani
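For readers who want to see those two tweaks as request bodies, here is a minimal sketch in Python. It only builds the JSON and sends nothing. The field name `my_vector`, the tiny query vector, and the `k` / `num_candidates` values are placeholders I made up for illustration, and the exact endpoint that accepts the `knn` body depends on your Elasticsearch version, so check the docs for yours.

```python
import json


def refresh_disabled_settings() -> dict:
    """Body for an index settings update that turns off periodic refresh."""
    return {"index": {"refresh_interval": "-1"}}


def warmup_knn_body(field: str, query_vector: list, k: int = 10,
                    num_candidates: int = 100) -> dict:
    """Body for a first kNN request that skips loading document sources.

    `field`, `k` and `num_candidates` are illustrative values, not
    recommendations from this thread.
    """
    return {
        "knn": {
            "field": field,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        },
        # Do not fetch _source for the warm-up request.
        "_source": False,
    }


# Hypothetical field name and a toy 3-dim vector, just to show the shape.
settings_body = refresh_disabled_settings()
search_body = warmup_knn_body("my_vector", [0.1, 0.2, 0.3])

print(json.dumps(settings_body))
print(json.dumps(search_body))
```

If indexing is still ongoing, keep in mind that with refresh disabled, new documents only become searchable after an explicit refresh.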

We have since added another index, Index_d, with more than 105 million documents and 768-dimensional vectors. This index might grow to over 200 million documents.

So in total there are:

  • Index a_cos: ~6 million
  • Index b_cos: ~10 million
  • Index c_cos: ~3 million
  • Index_d_cos: ~105 million

An ANN search currently takes 58 minutes.
That is far too long for our use case.
My hypothesis is that the index does not fit in RAM.

The setup is in a cloud environment where we currently have:

  • 1 Node
    • 8 vCPUs
    • 128 GB of RAM
    • 4TB of SSD storage

So now my questions are:

  • What would be a better setup to reach acceptable performance (req: ~1s)?
    • Cluster Size?
    • Node Size?
      • vCPUs
      • RAM
      • SSD Storage
  • Are there additional tweaks that would improve performance?

Thank you so much.

Thanks for reporting your use case.

58 minutes seems like a very long time for an ANN search. Are you sure the search is not blocked on indexing? Are you running these searches after all indexing is done and the index has been refreshed?

For the fastest searches, we recommend having enough RAM for all vectors to fit in memory. For example, if you have 200M vectors of 768 dims, and each dim is a float taking 4 bytes, a comfortable RAM size should be at least 4 * 768 * 200M bytes ≈ 614 GB (about 572 GiB). That's really a lot. There are several ways to address it:

  • Distribute vector search across several machines.
  • Reduce the number of dims. 768 is a lot of dims; is there a way to reduce them?
  • Quantize vector values to lower precision (e.g. 8 bits instead of 32 bits). This is still work in progress on the Lucene side and currently not supported in Elasticsearch, but we aspire to have it.
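To make the sizing arithmetic easy to replay with your own numbers, here is a rough Python sketch of the estimate above (vectors × dims × bytes per dim) and of how each of the three options changes it. It counts raw vector data only. Ignoring graph overhead, JVM heap and OS needs, and treating a node's full 128 GB as available for vectors, are simplifying assumptions of mine, so read the results as lower bounds rather than a sizing guide.

```python
GB = 10**9  # decimal gigabytes


def vector_ram_bytes(num_vectors: int, dims: int, bytes_per_dim: int = 4) -> int:
    """Raw size of the vector data: one value per dimension per vector."""
    return num_vectors * dims * bytes_per_dim


def nodes_needed(total_bytes: int, ram_per_node_bytes: int) -> int:
    """Smallest node count whose combined RAM covers total_bytes (ceiling division)."""
    return -(-total_bytes // ram_per_node_bytes)


current = vector_ram_bytes(105_000_000, 768)      # today's large index, float32
target = vector_ram_bytes(200_000_000, 768)       # projected growth, float32
half_dims = vector_ram_bytes(200_000_000, 384)    # if dims could be halved
one_byte = vector_ram_bytes(200_000_000, 768, 1)  # if 8-bit values were available

for label, size in [("current", current), ("target", target),
                    ("half dims", half_dims), ("8-bit", one_byte)]:
    print(f"{label:>10}: {size / GB:.1f} GB")

# Assumption: every node contributes all 128 GB to vectors, which is optimistic.
print("nodes for target:", nodes_needed(target, 128 * GB))
```

Even the current 105M-vector index comes out well above the 128 GB on the single node, which is consistent with the hypothesis that the index does not fit in RAM.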

Thanks @mayya for your answer.
Is there a blueprint for an optimal node setup?

  • How large should the RAM be?
  • How many CPUs?

Thank you so much.

I also want to step in and say that it would be very useful to have guidelines on the right node/cluster setup for efficient ANN search.

For example, which resources (CPU, RAM, disk, number of nodes) matter most and have the largest impact on ANN indexing and search?
