Performance searching single index vs multiple indices

Hi All,

How should I understand this tip from The Definitive Guide?

Searching 1 index of 50 shards is exactly equivalent to searching 50 indices with 1 shard each: both search requests hit 50 shards.

I have 2048 GB of data and 8 machines. Based on this tip, I could create either 1 index with 70 shards or 10 indices with 7 shards each. Both give 70 shards.

Suppose I have 2 search implementations: 1) for 1 index, I use one thread to search; 2) for 10 indices, I use 10 threads, each searching one index.

Which implementation performs better, or are they exactly the same?

Thanks.

In both cases you are hitting 70 shards, which will be processed in parallel on different threads, so they should basically be the same.
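As a side note on the second implementation: a single search request can name several indices as a comma-separated list in the path, so the client does not necessarily need one thread per index. A minimal sketch of building such a path is below. The index names `data` and `data-0` to `data-9` and the helper `search_path` are made up for illustration, and nothing here sends a request.

```python
from urllib.parse import quote


def search_path(indices):
    """Build the path of one search request that targets the given indices."""
    # Index names are joined with commas; each name is URL-encoded on its own.
    return "/" + ",".join(quote(name, safe="") for name in indices) + "/_search"


# Hypothetical index names, used only for illustration.
one_index = search_path(["data"])
ten_indices = search_path([f"data-{i}" for i in range(10)])

print(one_index)
print(ten_indices)
```

Either path results in one request from the client; the fan-out to the 70 shards happens inside the cluster in both cases.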

Thanks for your reply.

Another question: I have 8 machines for 2048 GB of data. Which design is better? 1) 8 shards of 256 GB each, so each machine has one shard; 2) 70 shards, so each machine has 8 or 9 shards.

I know of 2 best practices: keep each shard at around 30 GB, and have one shard per machine. In my case, I need the best performance. Which one is better?
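For reference, the arithmetic behind the two layouts can be sketched as follows. This is only an illustration, not a sizing recommendation: the function name `shard_layout` is made up, and it assumes the data is spread evenly across primary shards with no replicas.

```python
def shard_layout(total_gb, shards, machines):
    """Return (GB per shard, shards per machine).

    Assumes an even spread of data and no replica shards.
    """
    return total_gb / shards, shards / machines


for shards in (8, 70):
    size_gb, per_machine = shard_layout(2048, shards, 8)
    print(f"{shards} shards: {size_gb:.1f} GB per shard, "
          f"{per_machine:.2f} shards per machine")
```

With 8 shards this gives 256.0 GB per shard and 1.00 shards per machine; with 70 shards it gives about 29.3 GB per shard and 8.75 shards per machine.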

Thanks.

Are you optimising for query latency or query throughput or a combination of the two? The optimal shard count will depend on your use case, hardware, data and queries, so I would recommend you run a benchmark to find out.

I'm optimizing for query latency.

Do you have a specific number of concurrent queries you are targeting?

Concurrent queries might be 100.

It is hard to reason about, so I still recommend that you benchmark it. Have a look at these talks for some guidance:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right

https://www.elastic.co/elasticon/conf/2018/sf/the-seven-deadly-sins-of-elasticsearch-benchmarking
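To give a rough idea of what measuring query latency under 100 concurrent queries involves, here is a hand-rolled sketch using only the Python standard library. It is not Rally and not a substitute for it. `fake_query` is a placeholder that just sleeps; in a real test it would be replaced with an actual search call against each shard layout, and the names `measure_latencies` and `summarize` are made up for this example.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latencies(run_query, total_queries, concurrency):
    """Call run_query total_queries times on a pool of `concurrency` threads.

    Returns the per-call latencies in seconds.
    """
    def timed_call(_):
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(total_queries)))


def summarize(latencies):
    """Return the median and an estimate of the 99th percentile."""
    cuts = statistics.quantiles(latencies, n=100)  # 99 cut points
    return {"p50": statistics.median(latencies), "p99": cuts[98]}


def fake_query():
    # Placeholder (assumption): stands in for a real search request.
    time.sleep(0.01)


latencies = measure_latencies(fake_query, total_queries=500, concurrency=100)
print(summarize(latencies))
```

Running the same harness against the 8-shard and the 70-shard layout, with realistic queries and data, would give latency numbers that can be compared directly. The talks above cover pitfalls such as warm-up and unrealistic workloads that a sketch like this ignores.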

OK. Thanks very much for your reply @Christian_Dahlqvist
