Hi,
At the moment I have a cluster with 8 nodes, 2TB of RAM and 5TB of SSD storage.
After indexing vectors (1.2TB for now, and this will grow over time), a search takes 30 seconds.
I've tried quantization, adding more nodes, and adding more shards, which reduced the response time to 10 seconds.
I even increased the JVM heap, and only afterwards realized I shouldn't have done that.
Now I've come across the hot-warm-cold architecture. I don't think it's a good idea to change the data nodes to hot nodes, but it seems to be the only option I have left to try.
Are there any recommendations? Is increasing the JVM heap the right move in this situation? Would more hardware resources help, and how should I manage them? Or is a hot architecture a good fit?
Which version of Elasticsearch are you using? If you are not on the latest, I would recommend upgrading, as this area moves fast and is continuously being improved.
How many indices and shards is your data spread across? What is the average shard size?
Is that the RAM and storage per host or in total? Is the 1.2TB just the primary shard size or does it include replicas? If so, how many replicas?
A hot-warm-cold architecture is generally built on the assumption that newer data is queried more frequently and that longer latencies are acceptable when querying older data. If query requirements do not vary based on data age for your use case, I would stay away from this type of architecture.
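In case it helps with gathering the shard numbers asked about above, here is a minimal sketch that reads them from the `_cat/shards` API and summarizes primaries and replicas. The base URL and index name are placeholders, and it assumes an unsecured cluster; with security enabled you would need to add authentication.

```python
import json
import urllib.request


def fetch_shard_rows(base_url, index):
    """Fetch one row per shard copy from the _cat/shards API as parsed JSON.

    bytes=b asks for the store size as a plain byte count.
    """
    url = (
        f"{base_url}/_cat/shards/{index}"
        "?format=json&bytes=b&h=index,shard,prirep,store,node"
    )
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def summarize_shards(rows):
    """Count primary and replica shards and average the primary store size."""
    # Unassigned shards report no store size, so they are skipped.
    primaries = [r for r in rows if r["prirep"] == "p" and r.get("store")]
    replicas = [r for r in rows if r["prirep"] == "r" and r.get("store")]
    primary_bytes = sum(int(r["store"]) for r in primaries)
    gib = 1024 ** 3
    return {
        "primary_shards": len(primaries),
        "replica_shards": len(replicas),
        "primary_total_gb": primary_bytes / gib,
        "avg_primary_gb": primary_bytes / gib / len(primaries) if primaries else 0.0,
    }
```

Something like `summarize_shards(fetch_shard_rows("http://localhost:9200", "my-vector-index"))` would then give the shard count, the total primary size and the average primary shard size in one go.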
It's 8.15.3
It's just one index with 25 shards, with an average size of about 40 GB.
In total I have 2TB of RAM. At the moment it's around 200 GB per host.
The 1.2TB is just the primary shards. I don't use replicas.
So it's not a good idea because my data is not time-based.
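For what it's worth, here is a rough back-of-the-envelope check of the heap question, using the numbers from this thread (8 nodes, around 200 GB RAM per host, 1.2TB of primaries). Vector search depends heavily on the operating system's filesystem cache, which is whatever RAM is left after the JVM heap, so a larger heap leaves less room for the index. The heap sizes of 31 GB and 100 GB are assumptions for illustration, not values from my cluster, and the on-disk index size is only a proxy for what actually needs to be cached.

```python
def page_cache_headroom(nodes, ram_per_node_gb, heap_per_node_gb, index_size_gb):
    """Return (total filesystem cache in GB, ratio of cache to index size).

    Rough estimate only: it ignores memory used by the OS and other
    processes, and treats the whole on-disk index as needing to be cached.
    """
    cache_gb = nodes * (ram_per_node_gb - heap_per_node_gb)
    return cache_gb, cache_gb / index_size_gb


# Numbers from the thread; both heap sizes are hypothetical.
for heap_gb in (31, 100):
    cache_gb, ratio = page_cache_headroom(8, 200, heap_gb, 1200)
    print(f"heap={heap_gb} GB -> cache={cache_gb} GB, cache/index={ratio:.2f}")
```

A ratio below 1 means the index cannot fit in the filesystem cache at all, so searches would have to read from disk.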