We have dedicated ES clusters we use for our vector database and approximate kNN search (a single int8_hnsw dense vector) as part of our hybrid-search solution.
We've started to push one cluster pretty hard, and we've noticed that if we scale out by a couple more nodes, performance can actually get slightly worse in our perf tests under normal load.
My theory is that we're probably not getting the best out of caching with more nodes. I'm aware that vector search relies on the underlying Linux page cache, and we've already made sure we've got plenty of RAM available on the nodes, so we've started looking into pre-loading the index files as described here.
But we don't currently have any .vex or .veq files in our data folder.
What we do have are:
.si
.cfe / .cfs
.dvd / .dvm
.fnm
So my questions are:
Which of these would we be best to pre-load?
Am I right in saying our vectors are probably stored in the .cfe or .cfs files (which my Google-fu tells me are Lucene compound files)?
If so, what triggers Lucene to write compound files? Is that the default now? Is it possible for the index to end up with either .vex/.veq or .cfe/.cfs? I.e. should we configure pre-load for all of those file extensions?
Compound files are normally used when the segment size is less than 1GB. Is that your case?
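One quick way to check segment sizes (and whether they're stored as compound files) is the cat segments API - the index name here is just a placeholder:

```
GET _cat/segments/my-knn-index?v&h=index,shard,segment,docs.count,size,compound
```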
You could preload compound files, but that's probably too much - even more so with quantized vectors, since you won't need to prewarm the non-quantized vector values, for example.
Yes, our segments will be pretty small. We have a trickle of indexing events throughout the day as products come in and out of stock, so there are lots of small segments getting written and then merged by the background merge process...
Even when ES merges segments, we only have ~150K documents max, and because this is a dedicated kNN cluster, those documents are super basic (ID + vector embedding), so the merged segments are probably still pretty small - I have definitely seen .v* files in there in the past though!
So would I be correct in saying that theoretically when a segment merge happens if the merged segment is > 1GB, Lucene might decide to write it out as .vex + .veq instead?
If so, I think we'll do a quick perf test with pre-loading ['.cfe', '.cfs', '.vex', '.veq'] and see if it makes a difference to us. From what you say, I suspect possibly not...
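For reference, this is roughly what I'm planning to test with. index.store.preload is a static index setting, so it'd have to go in at index creation (or via a template); the index name is just a placeholder, and I believe the extensions are listed without the leading dot:

```
PUT /knn-products
{
  "settings": {
    "index.store.preload": ["cfs", "cfe", "vex", "veq"]
  }
}
```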
Even when ES merges segments, we only have ~150K documents max
At that volume, you might want to experiment with doing exact kNN search via script_score. You'll get better search results via exact kNN, but you'll need to check that the latency is appropriate for your use case.
See this blog post for more details on approximate vs exact knn.
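Something like this is the rough shape of an exact kNN query via script_score - the index name, field name, and query vector here are just placeholders:

```
GET /knn-products/_search
{
  "query": {
    "script_score": {
      "query": { "match_all": {} },
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
        "params": { "query_vector": [0.12, -0.34, 0.56] }
      }
    }
  }
}
```

You can swap the match_all for a filter query if you only want to score a subset of documents.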
So would I be correct in saying that theoretically when a segment merge happens if the merged segment is > 1GB, Lucene might decide to write it out as .vex + .veq instead?
That is correct - the default merge policy for Elasticsearch uses 1GB as the limit for using compound files.
If so, I think we'll do a quick perf test with pre-loading ['.cfe', '.cfs', '.vex', '.veq'] and see if it makes a difference to us. From what you say, I suspect possibly not...
Keep in mind that you're searching over many small segments - that's going to make search slower, as kNN needs to go over every segment to get results. You should try to end up with fewer segments on the kNN search side - adjusting the merge policy or running periodic force_merge operations could be used for that.
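For example, something along these lines (index name is a placeholder):

```
POST /knn-products/_forcemerge?max_num_segments=1
```

Just be aware that force-merging down to a single segment is best done when the index isn't being actively written to - for a trickle of updates like yours, a periodic job after indexing quiets down would be the usual approach.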