Hardware recommendation for vector search

sandra_a · November 20, 2024, 7:02am

Hi,
At the moment I have a cluster with 8 nodes, 2TB of RAM and 5TB of SSD storage.
After indexing my vectors (1.2TB for now, and this will grow over time), a search took 30 seconds.
I tried quantization, adding more nodes and adding more shards, and after that the response time dropped to 10 seconds.
I even increased the JVM heap, but afterwards I realized I shouldn't have done that.
Now I have come across the Hot-Warm-Cold architecture. I don't think it's a good idea to change my data nodes to hot nodes, but it seems to be the only option I have left to try.
Are there any recommendations? Is increasing the JVM heap right in this situation? Will adding hardware resources help, and how should I manage them? Or is a hot architecture a good fit?

Christian_Dahlqvist · November 20, 2024, 8:58am

Which version of Elasticsearch are you using? If not on the latest I would recommend upgrading, as this area moves fast and is continuously being improved.

How many indices and shards is your data spread across? What is the average shard size?

Is that the RAM and storage per host or in total? Is the 1.2TB just the primary shard size or does it include replicas? If so, how many replicas?
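
One way to gather the shard figures asked about above is to summarize the output of `GET _cat/shards?format=json&bytes=b`. The following is a minimal sketch that assumes the rows have already been fetched and parsed into dictionaries; the index name `vectors`, the sample sizes, and the helper name `summarize_shards` are hypothetical and only illustrate the calculation.

```python
from collections import defaultdict


def summarize_shards(rows):
    """Count shards and average their size, split into primaries and replicas.

    `rows` is assumed to look like the JSON returned by
    `GET _cat/shards?format=json&bytes=b`, where `store` is a byte count
    given as a string and `prirep` is "p" or "r".
    """
    stats = defaultdict(lambda: {"count": 0, "bytes": 0})
    for row in rows:
        # Skip shards that are not allocated or report no size.
        if row.get("state") != "STARTED" or row.get("store") is None:
            continue
        kind = "primary" if row["prirep"] == "p" else "replica"
        stats[kind]["count"] += 1
        stats[kind]["bytes"] += int(row["store"])
    return {
        kind: {
            "count": s["count"],
            "total_gb": s["bytes"] / 1e9,
            "avg_gb": s["bytes"] / s["count"] / 1e9,
        }
        for kind, s in stats.items()
    }


# Hypothetical sample rows, not taken from the cluster in this thread.
sample_rows = [
    {"index": "vectors", "shard": "0", "prirep": "p", "state": "STARTED", "store": "40000000000"},
    {"index": "vectors", "shard": "1", "prirep": "p", "state": "STARTED", "store": "50000000000"},
    {"index": "vectors", "shard": "1", "prirep": "r", "state": "UNASSIGNED", "store": None},
]

summary = summarize_shards(sample_rows)
print(summary)
```

Reporting primaries and replicas separately makes it clear whether a quoted data size includes replica copies.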

A hot-warm-cold architecture is generally built on the assumption that newer data is queried more frequently and that longer latencies are acceptable when querying older data. If query requirements do not vary based on data age for your use case I would stay away from this type of architecture.

sandra_a · November 20, 2024, 1:37pm

@Christian_Dahlqvist thank you for your response.

It's 8.15.3

It's just one index, 25 shards with average size 40.

In total I have 2TB of RAM. At the moment it's around 200 GB per host.
The 1.2TB is just the primary shards. I don't use replicas.
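
As a rough sanity check on these figures: approximate kNN search generally performs best when the vector data fits in the operating system's filesystem cache, which is the RAM left over outside the JVM heap. The sketch below does that arithmetic with the numbers from this thread. The heap size of 31 GB per node is an assumption, since the actual heap setting was not given, and the helper name `page_cache_headroom_gb` is hypothetical. The estimate also ignores memory used by the OS and other processes.

```python
def page_cache_headroom_gb(nodes, ram_per_node_gb, heap_per_node_gb, data_gb):
    """Return the RAM left for the filesystem cache after subtracting the data size.

    A negative result means the data cannot be fully cached in memory.
    """
    off_heap_gb = nodes * (ram_per_node_gb - heap_per_node_gb)
    return off_heap_gb - data_gb


headroom = page_cache_headroom_gb(
    nodes=8,              # from the thread
    ram_per_node_gb=200,  # "around 200 GB" per host, from the thread
    heap_per_node_gb=31,  # ASSUMPTION: heap size was not stated
    data_gb=1200,         # 1.2TB of primary shards, from the thread
)
print(f"Estimated filesystem cache headroom: {headroom} GB")
```

Under these assumptions the margin is thin, and a larger heap shrinks it further, which is one reason raising the JVM heap can hurt rather than help vector search.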

So it's not a good idea because my data is not time-based.

Christian_Dahlqvist · November 21, 2024, 1:56pm

Have you looked at these guidelines?

sandra_a · November 25, 2024, 7:32am

@Christian_Dahlqvist

Thank you for your guidance. I reviewed these guidelines and implemented some of them.

From your last reply, I deduce that "partitioning" would be helpful. Is that correct? In general, is partitioning a good idea for large indices?

Christian_Dahlqvist · November 25, 2024, 7:45am

What do you mean by this?

sandra_a · November 25, 2024, 7:47am

Splitting my vector index into two indices, for example. Is it wise to do this for my vector index?

Christian_Dahlqvist · November 25, 2024, 7:51am

There are optimisations you can make based on the structure of your data and how you query it, but as you have not described anything about your data or query patterns it is impossible to say.

Why would this improve performance? How would the data be split?

As I know nothing about the structure of the data nor how it is queried I can not tell whether this makes sense or not.

You have also not told us anything about what your performance target is, just that 30 second latency is too high.
