I'm confused about how many shards I should put on a node (64 GB memory, 32 processors, and 500 GB disk). Is there a recommended range? Some projects also calculate the number of shards by capacity; in their testing, a 20 GB shard worked best, so with a total index size of 1.5 TB they end up with roughly 77 shards in the cluster. Is that a correct way to calculate it?
I also have a question about how to add shards when scaling horizontally. Since our cluster only has one document type, is it a good idea to add shards by creating new indices and using an index alias that points to all of them?
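Concretely, this is the kind of setup I mean. A rough sketch using the Python elasticsearch client (8.x keyword-style API); the index names, alias name, and shard counts here are just placeholders:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Create an additional index to add capacity (settings are illustrative only).
es.indices.create(
    index="logs-000002",
    settings={"number_of_shards": 3, "number_of_replicas": 1},
)

# Point a single alias at every index so existing queries don't have to change.
es.indices.update_aliases(actions=[
    {"add": {"index": "logs-000001", "alias": "logs-all"}},
    {"add": {"index": "logs-000002", "alias": "logs-all"}},
])

# Searches go through the alias and fan out across all underlying indices.
resp = es.search(index="logs-all", query={"match_all": {}}, size=10)
print(resp["hits"]["total"])
```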
We don't recommend having shards over 50 GB, as it just makes shard reallocation harder than it should be. You can probably go larger if you only have a single node, but don't forget that to change the number of shards for an index, you need to reindex the entire index.
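To illustrate that last point: changing the primary shard count means creating a new index with the desired settings and copying every document into it. A sketch with the Python client, where the index names and shard count are made up:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# New index with the target number of primary shards.
es.indices.create(index="myindex-v2", settings={"number_of_shards": 6})

# Copy all documents from the old index into the new one.
es.reindex(
    source={"index": "myindex-v1"},
    dest={"index": "myindex-v2"},
    wait_for_completion=True,
)

# Once the new index is verified, switch traffic over and drop the old one.
es.indices.delete(index="myindex-v1")
```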
Your query speed will vary depending on shard size, so you will need to do some benchmarking in order to find the appropriate balance between size and speed for your documents and querying requirements.
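One rough way to compare configurations is to run the same query against differently sized indices and compare the `took` field in the search response, which reports server-side query time in milliseconds. Again just a sketch with placeholder index names, and you'd want to repeat each query several times so caches are warm:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Run the same query against each candidate layout and compare latency.
for index in ["one-big-shard-index", "three-small-shards-index"]:
    resp = es.search(index=index, query={"match": {"message": "error"}}, size=0)
    print(index, resp["took"], "ms")  # 'took' = server-side time in milliseconds
```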
Does that mean that as the shard size grows, queries get slower and slower?
For example: is a single 100 GB shard on a physical node slower than three 33 GB shards on the same node?