Active_primary_shards > 1 when you have more than 1 data node

I have a 3-node Elasticsearch cluster. All nodes are data-bearing nodes.

If active_primary_shards is 1, my data would not be distributed among the 3 nodes. Is that understanding correct? My single shard would sit on a single node. Wouldn't setting the number of shards to 1 defeat the purpose of having multiple nodes?

No, you could (for instance) set number_of_shards: 1 and number_of_replicas: 2 to have a copy of the shard on every node, all of which will respond to searches.
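
To make that concrete, here is a minimal sketch in Python of the settings body such an index would be created with, plus the arithmetic for how many shard copies result. The helper names (index_settings, total_shard_copies) are hypothetical and only for illustration; the number_of_shards and number_of_replicas keys are the settings named above. Since a replica is not allocated to the same node as its primary, 3 copies on 3 data nodes should end up as one copy per node.

```python
import json


def index_settings(number_of_shards: int, number_of_replicas: int) -> dict:
    """Build a settings body for an index-creation request (illustrative helper)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        }
    }


def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Each primary shard has number_of_replicas additional copies."""
    return number_of_shards * (1 + number_of_replicas)


# One primary shard with two replicas, as suggested above.
body = index_settings(number_of_shards=1, number_of_replicas=2)
copies = total_shard_copies(number_of_shards=1, number_of_replicas=2)

print(json.dumps(body, indent=2))
print(f"total shard copies: {copies}")
```

The printed JSON is what you would send as the request body when creating the index. Note that number_of_replicas can be changed later on a live index, while number_of_shards is fixed at creation time unless you reindex or use the split/shrink APIs.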

Thank you, I'm starting to understand better now.

So increasing "number of shards" would only make sense once I start to hit the data limits of a single shard.

If I have 150GB of data and a 3-node cluster, where all 3 nodes are data-bearing, then it would make sense to set "number of shards" to 3, would you agree?

It depends™ :smile: Specifically the details of your data and how you are indexing & searching it will affect the answer here. But as a starting point number_of_shards: 3 on a 150GB index sounds reasonable to me.
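
For a rough feel of the numbers, here is a back-of-the-envelope sketch in Python. The function name shard_layout is hypothetical, and it assumes data is spread evenly across shards and shards are balanced evenly across nodes, which real clusters only approximate. The replica count is not stated in this thread, so it is a parameter; 0 and 1 are used below purely as illustrative assumptions.

```python
def shard_layout(index_size_gb: float, number_of_shards: int,
                 number_of_replicas: int, data_nodes: int) -> dict:
    """Estimate shard sizes and per-node storage, assuming perfectly even distribution."""
    primary_shard_gb = index_size_gb / number_of_shards
    shard_copies = number_of_shards * (1 + number_of_replicas)
    total_gb = index_size_gb * (1 + number_of_replicas)
    return {
        "primary_shard_gb": primary_shard_gb,
        "shard_copies": shard_copies,
        "total_gb": total_gb,
        "gb_per_node": total_gb / data_nodes,
    }


# 150GB index, 3 primary shards, 3 data nodes; replica counts are assumptions.
no_replicas = shard_layout(150, number_of_shards=3, number_of_replicas=0, data_nodes=3)
one_replica = shard_layout(150, number_of_shards=3, number_of_replicas=1, data_nodes=3)

print(no_replicas)
print(one_replica)
```

With 3 primaries, each shard holds roughly 50GB under these assumptions. Adding one replica doubles the total stored data, so each node would carry roughly 100GB.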
