KNN search speed

dendog1 · March 3, 2023, 5:07pm

Hi!

Today we are using ES mainly as a key / value store where most of our reads are just get by key.
We have recently started to use KNN, where we have:

Around 10MM docs.
384 dim vectors.
Using cosine sim as the metric.
Documents are large (webpage), but we retrieve just the URL & vector.

The performance is really bad, but only on the first query. The first query can take around 20 seconds, a second query (if run within a few seconds) takes about 10 seconds, and if we keep making KNN queries rapidly, the time drops to sub-second.

If we stop making such queries for a few minutes the next query is once again 20 seconds or so.

I would love to understand this behaviour to see if there is something which can be done to "warm up" this query type.

Thanks!
D
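For readers following along, the kind of request described above might look roughly like the sketch below. It only builds the body for `POST /<index>/_search` with the top-level `knn` option; the field names `embedding` and `url`, and the `k` / `num_candidates` defaults, are assumptions for illustration and are not taken from the original post.

```python
def build_knn_search_body(query_vector, k=10, num_candidates=100,
                          vector_field="embedding", url_field="url"):
    """Build a request body for an approximate kNN search.

    vector_field and url_field are hypothetical field names; replace them
    with the names used in your own mapping.
    """
    if k > num_candidates:
        # Elasticsearch requires num_candidates to be at least k.
        raise ValueError("num_candidates must be >= k")
    return {
        "knn": {
            "field": vector_field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        # Return only the small fields instead of the whole web page.
        "_source": [url_field, vector_field],
        "size": k,
    }


# A 384-dimensional placeholder vector, matching the dimensions in the post.
body = build_knn_search_body([0.0] * 384)
```

Limiting `_source` keeps the response small, but it does not change how much of the vector index has to be in memory to answer the query.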

BenTrent · March 6, 2023, 1:42pm

@dendog1 have you read through "Tune approximate kNN search" (Elasticsearch Guide [8.6])?

Some observations:

Being slow and then fast indicates to me that the vector index was not in memory at first and was then loaded into memory.
Becoming slow again shows that it is being evicted from memory. Usually this indicates that you don't have enough RAM to hold the vector index alongside the other structures you are using in the index.

So, is this index being used for other things? Are documents continually being added?

dendog1 · March 6, 2023, 2:56pm

Hey @BenTrent thank you so much for getting back to me!!

So yes, I have read about tuning the search - those points are all noted, and we have applied the options that are available to us.

On your observations, yes so this is the issue that I would like to understand - how to keep this index in memory.

This index at the moment:

98% of the work is get by ID; we are using ES as a Redis-like cache here.
1.9% is inserts of new documents into this index.
0.1% are the random KNN search queries which users run on this index, and where we really need good performance!

Is there any way I can force ES to keep this infrequent operation "hot"?

Thanks in advance!

dendog1 · March 7, 2023, 2:04pm

One finding I have is that sharding the index seems to be detrimental to speed, which seems rather odd, as I thought sharding would allow parallel searches...

BenTrent · March 7, 2023, 3:03pm

It depends.

If the different shards still have 40+ segments, it wouldn't help much. If your different shards had much fewer segments, then I would expect some improvement.
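A quick way to see how many segments each shard has is to count the rows returned by `GET /_cat/segments/<index>?format=json`. The sketch below assumes the response has already been fetched and parsed into a list of dicts; the sample rows and the index name `webpages` are made up for illustration.

```python
from collections import Counter


def segments_per_shard(rows):
    """Count segments per (index, shard, primary/replica) copy.

    `rows` is the parsed JSON from the cat segments API, where each row
    describes one segment and carries "index", "shard" and "prirep" keys.
    """
    counts = Counter()
    for row in rows:
        counts[(row["index"], row["shard"], row["prirep"])] += 1
    return dict(counts)


# Hypothetical sample rows, not real output from the cluster in this thread.
sample_rows = [
    {"index": "webpages", "shard": "0", "prirep": "p", "segment": "_0"},
    {"index": "webpages", "shard": "0", "prirep": "p", "segment": "_1"},
    {"index": "webpages", "shard": "1", "prirep": "p", "segment": "_0"},
]
print(segments_per_shard(sample_rows))
```

Each segment carries its own vector graph, so a shard with dozens of segments has to search dozens of graphs per query.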

dendog1 · March 7, 2023, 6:29pm

Thanks @BenTrent would you please be able to give some feedback about how one can force ES to keep this index and operation hot?

This index is accessed all the time, but the operations are very different.

BenTrent · March 7, 2023, 7:07pm

@dendog1

I will have to dig a bit more on thinking about how to keep KNN vector files in memory preferential to others.

Some things to check:

The number of segments.
The vector disk usage: Analyze index disk usage API | Elasticsearch Guide [master] | Elastic
If the vector disk usage is MORE than the off-heap RAM (the RAM not taken by the JVM heap), it could be that things are getting kicked out.

It would be good to know how your vector index size compares to your server's RAM.
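As a rough sketch of that check: the disk usage API is called with `POST /<index>/_disk_usage?run_expensive_tasks=true`, and the vector size can then be read from the response. The helper below assumes the response nests the totals under the index name and `all_fields`; the index name `webpages` and the numbers in the sample are hypothetical.

```python
def disk_usage_path(index):
    """Path for the analyze index disk usage API (the flag is required)."""
    return f"/{index}/_disk_usage?run_expensive_tasks=true"


def knn_vector_bytes(response, index):
    """Read the total kNN vector size in bytes from a parsed response.

    Assumes the layout {"<index>": {"all_fields": {"knn_vectors_in_bytes": N}}}.
    """
    return response[index]["all_fields"]["knn_vectors_in_bytes"]


# Hypothetical, trimmed-down response used only to exercise the helper.
sample_response = {
    "webpages": {
        "store_size_in_bytes": 2000,
        "all_fields": {"total_in_bytes": 1900, "knn_vectors_in_bytes": 500},
    }
}
print(disk_usage_path("webpages"))
print(knn_vector_bytes(sample_response, "webpages"))
```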

BenTrent · March 9, 2023, 12:32pm

@dendog1

One thing to try is: Preloading data into the file system cache | Elasticsearch Guide [8.6] | Elastic

The vector index file extensions are: vem, vex, and vec.
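A minimal sketch of what that could look like, using the extensions listed above. `index.store.preload` is a static index setting, so the body below would be sent when creating the index (for example `PUT /<index>`) rather than applied to an open index; the helper name is made up for illustration.

```python
def preload_settings(extensions=("vem", "vex", "vec")):
    """Build index-creation settings that preload the given file extensions.

    The default extensions are the vector index files named in this thread.
    """
    return {"settings": {"index.store.preload": list(extensions)}}


print(preload_settings())
```

Preloading only helps if the preloaded files fit in the file system cache; if they do not, they can still be evicted afterwards.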

dendog1 · March 10, 2023, 11:12am

@BenTrent again huge thank you for your replies, I ran the disk usage check:

{
  "store_size": "103.9gb",
  "store_size_in_bytes": 111633278466,
  "all_fields": {
    "total": "103.8gb",
    "total_in_bytes": 111525920926,
    "inverted_index": {
      "total": "10.9gb",
      "total_in_bytes": 11704910019
    },
    "stored_fields": "78.3gb",
    "stored_fields_in_bytes": 84076073351,
    "doc_values": "2.5gb",
    "doc_values_in_bytes": 2709327298,
    "points": "293.3mb",
    "points_in_bytes": 307648946,
    "norms": "41mb",
    "norms_in_bytes": 43087459,
    "term_vectors": "0b",
    "term_vectors_in_bytes": 0,
    "knn_vectors": "11.8gb",
    "knn_vectors_in_bytes": 12684873853
  }
}

My machine has 8gb of RAM, and the knn vectors are 11.8gb - does this mean we are already at the best performance we can get for the current index size vs cluster size?

Also, the full index does not fit in our RAM either - is that an issue too?

I had a look at the link for preloading data into the file system cache. I can do this, but there is a warning there about the size - do you think it is still wise given the above? And finally on this point - I thought this would only make a difference for the first few requests, and that after that ES would load the files into cache automatically - is that not correct?

BenTrent · March 10, 2023, 12:13pm

For kNN to work optimally, the entire graph and vectors need to be in memory.

So, that means you need at least around 12gb of ram (not including the ram used by the JVM).
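To make the arithmetic concrete, here is a small sketch using the numbers posted above (`knn_vectors_in_bytes` and 8gb of RAM). The heap size is an assumption: the thread does not state it, so half of RAM is used here purely as an illustrative value.

```python
GIB = 1024 ** 3


def off_heap_headroom_bytes(vector_bytes, ram_bytes, heap_bytes):
    """Off-heap RAM left over after holding the vector index.

    A negative result means the vector index cannot fit in the memory
    that remains outside the JVM heap.
    """
    return (ram_bytes - heap_bytes) - vector_bytes


knn_vectors_in_bytes = 12684873853  # from the disk usage output in the thread
ram_bytes = 8 * GIB                 # "8gb" machine, read here as 8 GiB
heap_bytes = 4 * GIB                # ASSUMPTION: heap set to half of RAM

headroom = off_heap_headroom_bytes(knn_vectors_in_bytes, ram_bytes, heap_bytes)
print(f"headroom: {headroom / GIB:.1f} GiB")
```

Under these assumptions the headroom is negative by several GiB, which is consistent with the vector files being evicted between queries. The real figure is worse, since the same off-heap memory is shared with every other file the node reads.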

In 8.5 we added support for 'byte' encoded vectors. So you can quantize your vectors to int8 to make them much smaller, and they may then run just fine within your current hardware constraints.
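One simple way to do this on the client side is sketched below: normalize each vector to unit length, scale by 127, and round. This is only one possible scheme, is lossy, and is not presented as what Elasticsearch does internally; the field name `embedding` in the mapping is hypothetical.

```python
import math


def quantize_to_int8(vector):
    """Scale a float vector into integers in [-127, 127].

    The vector is normalized first, so only its direction is kept, which
    is what cosine similarity compares.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot quantize a zero vector for cosine similarity")
    return [max(-127, min(127, round(x / norm * 127))) for x in vector]


# Mapping for a byte-encoded vector field; "embedding" is a made-up name.
byte_vector_mapping = {
    "properties": {
        "embedding": {
            "type": "dense_vector",
            "dims": 384,
            "element_type": "byte",
            "index": True,
            "similarity": "cosine",
        }
    }
}

print(quantize_to_int8([3.0, 4.0]))
```

Each dimension then takes one byte instead of four, so the raw vector data shrinks to roughly a quarter of its float size. Query vectors need to be quantized the same way, and recall should be re-checked after the change.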

dendog1 · March 10, 2023, 12:55pm

Thanks @BenTrent that makes sense, and thank you for letting me know about the quantized vectors!

One more question: would it make sense for me to create another index which would be a sample of the full index, containing just the vectors and some minor metadata? If I make such an index, which would be much smaller, would I be able to force ES to hold it in memory even if it is not frequently accessed?

Thank you!

BenTrent · March 23, 2023, 2:38pm

@dendog1 this would only happen if those indices were on different nodes. A node only has so much off-heap memory and all shards on that node must share it.

system · April 20, 2023, 2:38pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
