Earlier, we had a single index of around 400 GB that stored all of the cluster's data. We migrated from that architecture to a new one with monthly indices of around 30 GB each (the recommended index size in ES). After this migration, search queries that span multiple indices through the template alias (search-alias-* or search-alias-read) take more time than they did in the earlier architecture with one index. In the new architecture each monthly index has 1 replica shard. We store data for the past 2 years, so there are currently 25 indices. We expected the new architecture to be more robust and flexible. Am I missing something?
From the cluster performance metrics:
Cluster 1: (One index - 400 GB)
SearchLatency: 35 ms (per index/shard)
SearchRate: 40 operations/min
ThreadpoolSearchThreads: ~0 threads
Cluster 2: (Multi index - 25 indices each of 30 GB)
SearchLatency: 4 ms (per index/shard)
SearchRate: 500-1000 operations/min (the number of operations is multiplied by the number of shards on which the search is performed)
ThreadpoolSearchThreads: ~15-17 threads
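The note on SearchRate above can be checked with a quick calculation. This is only a rough sketch: it assumes the client-side query rate is the same on both clusters (taken here as the 40 operations/min reported for Cluster 1), that every query fans out to all 25 monthly indices, and that each index has one primary shard. None of these assumptions is confirmed by the metrics themselves, and the function name is made up for illustration.

```python
def shard_level_search_rate(queries_per_min: float, shards_hit_per_query: int) -> float:
    """Estimate shard-level search operations per minute.

    Each client query is counted once per shard it touches, so the
    shard-level rate is the query rate times the fan-out.
    """
    return queries_per_min * shards_hit_per_query


# Assumed figures (see the lead-in): 40 queries/min from clients,
# 25 monthly indices with one primary shard each.
QUERIES_PER_MIN = 40
SHARDS_HIT = 25

estimated = shard_level_search_rate(QUERIES_PER_MIN, SHARDS_HIT)
print(f"estimated shard-level rate: {estimated:.0f} operations/min")
```

Under those assumptions the estimate lands at the top of the reported 500-1000 operations/min range, which would be consistent with most queries touching every monthly index.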
I am not sure I follow: the information you provided suggests that latency is lower and search rate is higher for the multi-index cluster, not the other way around.
For the same ingestion rate, a query that retrieves data from Cluster 1 (single index) takes less time, and the same query against Cluster 2 (multi-index) takes more time. The number of search operations in Cluster 2 is multiplied by the number of indices over which the search is performed, which increases the total search time of the query.
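If the fan-out across all 25 indices is the cause, one thing that may be worth trying is to search only the monthly indices that a query's time range can touch, instead of the whole search-alias-* pattern. The sketch below builds that list of index names. The naming scheme `search-alias-YYYY-MM` and the helper name `monthly_indices` are assumptions made for illustration, since the thread does not show the actual index names, and this only helps when queries are bounded by a date range.

```python
from datetime import date


def monthly_indices(start: date, end: date, prefix: str = "search-alias-") -> list[str]:
    """Return the monthly index names covering [start, end], inclusive.

    The "<prefix>YYYY-MM" naming pattern is hypothetical; adjust it to
    match the real index names in the cluster.
    """
    if start > end:
        raise ValueError("start must not be after end")

    names = []
    year, month = start.year, start.month
    # Walk month by month until we pass the month containing `end`.
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


# Example: a query limited to mid-November 2023 through early January 2024
# only needs three of the monthly indices.
targets = monthly_indices(date(2023, 11, 15), date(2024, 1, 10))
print(",".join(targets))
```

The comma-separated string can be used as the search target in place of the wildcard alias, so the query is sent to 3 indices here rather than 25.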