Hello,
I use Elasticsearch as a vector DB for 1024-dim vectors. My initial setup would be around 20 million vectors, but it may increase to 100 million vectors over time. However, I always pre-filter the vector search to a specific client_id - I have around 10k clients, each with between 100 and 10k+ documents. The daily volume is about 300k writes and searches. I am wondering whether it is better to use well-known routing with a static index of 20-30 shards, or would you suggest implementing an ILM policy with single-shard indices and rollover after 30-40 GB? To sum up, my concern is whether to implement ILM for a growing index or routing to eliminate redundant shard searches. Thanks in advance for your help!
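For reference, here is roughly what my filtered kNN search looks like with the Python client (index, field and client names are just placeholders):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Filtered kNN search: the term filter restricts candidates to one client's
# documents, which is the pre-filtering mentioned above.
resp = es.search(
    index="documents",
    knn={
        "field": "embedding",                 # dense_vector, 1024 dims
        "query_vector": [0.1] * 1024,         # placeholder query vector
        "k": 10,
        "num_candidates": 100,
        "filter": {"term": {"client_id": "client-42"}},
    },
    source=False,
)
for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```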
Do you just add new documents or do you also perform updates and/or deletes?
Do you have a specified retention period for your data set?
I add new documents on a daily basis, and I also update and delete them based on user actions in the system. Retention period is not an issue, because this is only a database for ML.
In that case I think ILM and time-based indices are a bad fit, as they complicate updates and deletes and you do not want to use them to manage retention (which is their main purpose). I would go with a reasonably large number of primary shards together with routing based on the client ID.
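Something along these lines, sketched with the Python client (index name, mapping details and shard count are only placeholders, not a recommendation):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# One static index with a fixed number of primary shards.
es.indices.create(
    index="documents",
    settings={"number_of_shards": 24, "number_of_replicas": 1},
    mappings={
        "properties": {
            "client_id": {"type": "keyword"},
            "embedding": {
                "type": "dense_vector",
                "dims": 1024,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)

# Writes use routing=client_id so each client's documents land on one shard.
es.index(
    index="documents",
    id="doc-1",
    routing="client-42",
    document={"client_id": "client-42", "embedding": [0.0] * 1024},
)

# Searches pass the same routing value, so only that one shard is queried.
# Keep the client_id filter as well: other clients can hash to the same shard.
es.search(
    index="documents",
    routing="client-42",
    knn={
        "field": "embedding",
        "query_vector": [0.0] * 1024,
        "k": 10,
        "num_candidates": 100,
        "filter": {"term": {"client_id": "client-42"}},
    },
)
```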
Okay, thank you for your opinion. I would probably go with around 20 shards, so they start at around 5 GB each and reach at most 30 GB.
As your clients differ in size and routing places all of a client's documents on a single shard, it is possible you will get an uneven shard size distribution, so it may be worthwhile going a bit higher on the shard count for that reason. At least test it if you can.
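One quick way to keep an eye on that is to watch per-shard store sizes, for example (index name again a placeholder):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# List primary shard sizes to spot skew caused by large clients being
# routed onto single shards.
for row in es.cat.shards(index="documents", h="shard,prirep,docs,store", format="json"):
    if row["prirep"] == "p":  # primaries only
        print(f"shard {row['shard']}: {row['docs']} docs, {row['store']}")
```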