As our data has grown, we are having performance issues. I have been trying to determine the correct shard/node sizing, and I think we are way underpowered. I have found some information on determining the correct allocation of resources, but the clusters described there are so much bigger than what we are using that I wanted to get some verification.
We currently have a 7-node cluster, each machine running 8 cores and 56 GB of RAM, with 28 GB allocated to the heap (verified that compressed ordinary object pointers are in use). Our main index is 425 GB (161,133,495 docs), and we are running with only 5 shards. This is our first problem, as each shard is around 85–90 GB (the suggested max is 30 GB). We currently also have only one replica.
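In case it helps, here is how I pulled those numbers, using the standard `_cat` and nodes-info APIs (`main-index` stands in for our actual index name):

```
# Shard count and on-disk size per shard (prirep: p = primary, r = replica)
GET _cat/shards/main-index?v&h=index,shard,prirep,store,node

# Heap size per node and whether compressed ordinary object pointers are in use
GET _nodes/jvm?filter_path=nodes.*.jvm.mem.heap_max_in_bytes,nodes.*.jvm.using_compressed_ordinary_object_pointers
```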
With a target shard size of around 70 GB, 7 data nodes fit the estimate (425 GB works out to roughly 6–7 primary shards of that size).
Now, you could check on a test system whether there is a difference in performance and resource consumption between 7 nodes each holding 3 shards of 50 GB and 7 nodes each holding 3 shards of 70 GB. This will depend mostly on your queries (aggregations, filters).
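While running such a comparison, the `_cat/nodes` API gives a quick per-node view of heap pressure and CPU; a minimal example using the standard columns:

```
# Heap, RAM, CPU and load per node while the test queries run
GET _cat/nodes?v&h=name,node.role,heap.percent,ram.percent,cpu,load_1m
```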
If there is a difference and 7 nodes can't handle the workload, add a few more nodes (9, 11, 13, ...), and do not forget to keep the shard count aligned with the number of nodes.
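If you settle on a new shard count, the usual path is to create a new index with the desired number of primaries and reindex into it, since `number_of_shards` is fixed at index creation. A sketch, assuming 7 primaries (one per node; 14 or 21 would also stay aligned with the node count) and keeping your one replica; the index names are placeholders:

```
PUT /main-index-v2
{
  "settings": {
    "index.number_of_shards": 7,
    "index.number_of_replicas": 1
  }
}

POST _reindex?wait_for_completion=false
{
  "source": { "index": "main-index" },
  "dest":   { "index": "main-index-v2" }
}
```

Running `_reindex` with `wait_for_completion=false` returns a task ID you can poll, which is more practical for a 425 GB index than holding the request open.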