If you started 8 nodes with a single index in that configuration, data would
only exist on six of the nodes.
When considering how to lay out your data, you should think about what kind of
node configuration you expect to have, for example, if you have a maximum of
eight data nodes, then you should try to have a minimum of eight shards in the
cluster.
With your nodes, increasing the number of primary shards will increase the write
throughput, since the indexing operations are spread across more nodes.
On the other hand, increasing the number of replica shards will increase the
amount of nodes that can participate in read operations, assisting with read
performance.
In your case, doing 4 primary shards and 2 replicas (for a total of 12 shards)
is a fine solution, but of course, it depends on what sort of use case you are
using Elasticsearch for. I would recommend you start there (or something like 4
primary and 1 replica if you want to start even smaller).
When in doubt, the default settings (5 primaries and 1 replica) will go a very
long way.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.