When using Elasticsearch as a vector database, where many knowledge bases use different embedding models, the cluster can accumulate a large number of small indices over time, for example tens of thousands. Could this cause problems in Elasticsearch? If so, what are the recommended best practices for handling this scenario?
Hi @noday ,
While Elasticsearch can handle many indices, tens of thousands of small vector indices will likely degrade performance and cluster stability: every shard consumes heap and file handles, the cluster state grows with each index and mapping, and each index adds its own segment-merge and refresh overhead.

The usual recommendations are to consolidate small indices into fewer, larger ones (typically one index per embedding model, since `dense_vector` dimensions must match within a field), keep shard counts modest, use filtered index aliases so applications still see logical per-knowledge-base indices, apply index lifecycle management to retire unused data, and scale out the cluster when needed.
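As a concrete sketch of the consolidation-plus-aliases idea: below, documents from many knowledge bases share one index per embedding model, tagged with a `kb_id` keyword field, and each knowledge base is exposed through a filtered alias. The field, index, and alias names (`kb_id`, `vectors-minilm-768`, `kb-support-docs`) are made up for illustration; the request bodies are built as plain dicts so you can send them with whatever client you use.

```python
import json

# One consolidated index per embedding model (dense_vector dims are
# fixed per field, so only same-dimension data can be merged together).
mapping = {
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    "mappings": {
        "properties": {
            # Tag identifying which knowledge base a document belongs to.
            "kb_id": {"type": "keyword"},
            "embedding": {
                "type": "dense_vector",
                "dims": 768,               # must match the embedding model
                "index": True,
                "similarity": "cosine",
            },
            "text": {"type": "text"},
        }
    },
}

# A filtered alias gives each knowledge base its own "virtual index"
# without the per-index overhead of a real one (POST _aliases body).
alias_action = {
    "actions": [
        {
            "add": {
                "index": "vectors-minilm-768",
                "alias": "kb-support-docs",
                "filter": {"term": {"kb_id": "support-docs"}},
            }
        }
    ]
}

print(json.dumps(alias_action, indent=2))
```

Searches against `kb-support-docs` then behave like a dedicated index for that knowledge base, while the cluster only tracks one real index per model.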