Distributed search, how to work?

Hello,
I read this document about distributed search execution:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_query_phase.html
If query is broadcast to every shard in the index. Are there duplicate query phases on the same shard(primary and replicates)?

No, the search will only use one of the primary or replica.

OK,so the search scalability is the number of primary shard?

It's a number of things, but that is one part of it yes.

The usual way to scale up search is to add replicas, not primaries. The number_of_replicas setting of an index affects its capacity for search because each search of an index only goes to a single copy of each shard (i.e. either the primary or one of its replicas). If you add more replicas then each replica will serve proportionally fewer searches.

The number_of_shards setting of an index affects its capacity for both search and indexing. Each document that you index goes to every copy of its shard (i.e. the primary and all its replicas) so changing number_of_replicas doesn't affect the indexing load on each copy. However if you double the number_of_shards then you halve the load on each shard for both indexing and search.

The scaling is not perfect because there is overhead associated with adding extra shard copies. It doesn't work well to set these numbers to very high values in the hope that this gives you better performance. The only way to find the best performance is with careful benchmarking.

1 Like

Thanks for your replies.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.