KNN Search super slow

Hi, I'm using ES v8.1.3 in a dockerized environment with 4 shards. My index has around ~90M documents, each with a dense vector of dimension 768. But when I try to perform an ANN search (with cosine similarity), the query takes more than 20 minutes to complete. Do you have any clue how to speed up this process?

Two major things:

  • The kNN vectors and their graph connections all need to fit in memory.
  • If you can, normalize your vectors before ingesting and use dot_product instead of cosine.
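To expand on the second point: if every vector has unit L2 norm, `dot_product` produces the same ranking as cosine but skips the per-comparison normalization, which is why it's faster. A minimal sketch of the pre-ingest step (the `normalize` helper and the example vector are illustrative, not part of any ES client API):

```python
import math

def normalize(vec):
    """Scale a vector to unit L2 norm, as dot_product similarity requires."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Run this over each embedding before indexing it into the dense_vector field.
unit_vec = normalize([3.0, 4.0])
# sum of squares of unit_vec is now 1.0, so dot(a, b) == cosine(a, b)
# for any two vectors prepared this way.
```

Note that `dot_product` similarity in Elasticsearch actually rejects non-unit vectors at index time, so this step is mandatory if you switch.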

Your vectors and connections will need roughly 90_000_000 * (768 + 32) * 4 bytes to sit in memory.

That comes out to around 268 GiB of RAM, so it will definitely require more than one node to serve adequately.
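To sanity-check that arithmetic (a rough sizing sketch only; the `dims + 32` term treats the per-vector graph overhead as 32 extra float-sized slots, per the formula above):

```python
# Rough memory footprint for float32 vectors plus HNSW graph overhead.
num_vectors = 90_000_000
dims = 768
overhead_floats = 32          # approximate graph connections per vector
float_bytes = 4               # float32

total_bytes = num_vectors * (dims + overhead_floats) * float_bytes
total_gib = total_bytes / 2**30
print(f"{total_gib:.0f} GiB")  # ~268 GiB
```

So even before index structures, heap, and OS overhead, this dataset needs several hundred GiB of page cache spread across the cluster.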

Here is our overall tuning guide: Tune approximate kNN search | Elasticsearch Guide [8.5] | Elastic

We are actively making storage of vectors much more efficient. Additionally, in 8.6, we will release support for byte-value vectors so that users can quantize their floats before ingesting for HUGE space savings at minimal accuracy loss.
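The idea behind byte-valued vectors can be sketched as scalar quantization: map each float component onto a signed byte, cutting storage to a quarter. The scheme below is illustrative only, not Elasticsearch's internal implementation (the symmetric [-127, 127] range and `scale` parameter are my assumptions):

```python
def quantize(vec, scale=127.0):
    """Map floats in roughly [-1, 1] to int8 range: 1 byte vs 4 per component."""
    return [max(-127, min(127, round(x * scale))) for x in vec]

def dequantize(qvec, scale=127.0):
    """Approximate the original floats; error is bounded by ~1/scale."""
    return [q / scale for q in qvec]

q = quantize([0.5, -1.0, 0.0])   # -> small ints that fit in one byte each
approx = dequantize(q)           # close to the original values
```

With normalized embeddings the components are already bounded, which is why this kind of quantization loses so little accuracy in practice.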

Also, 8.5 includes multiple bug fixes and performance improvements, so I would recommend actively working toward upgrading your Elasticsearch cluster.

