For sizing the shards, I was referring to the following sections of the article: #Aim for shard sizes between 10GB and 50GB and #Aim for 20 shards or fewer per GB of heap memory.
I wanted to understand whether a single Elasticsearch node can be scaled up to ~29.29 TB using the following configuration:
• Max JVM heap size – 30 GB
• Shards per GB of heap – 20
• Total shards – 600
• Shard size – 50 GB
If this is possible, what needs to be taken into consideration for effective performance management?
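For reference, here is the back-of-the-envelope arithmetic behind the ~29.29 TB figure, as a small Python sketch (the variable names are mine and purely illustrative):

```python
# Back-of-the-envelope sizing arithmetic (illustrative only)
heap_gb = 30           # max JVM heap per node
shards_per_gb = 20     # guideline: at most 20 shards per GB of heap
shard_size_gb = 50     # guideline: shard sizes between 10 GB and 50 GB

max_shards = heap_gb * shards_per_gb       # 30 * 20 = 600 shards
max_data_gb = max_shards * shard_size_gb   # 600 * 50 = 30,000 GB
print(f"{max_shards} shards, {max_data_gb} GB (~{max_data_gb / 1024:.2f} TiB)")
# -> 600 shards, 30000 GB (~29.30 TiB)
```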
The numbers given are guidelines for maximums. Each shard carries some heap overhead, and you also need enough spare heap to handle indexing requests and queries. Exactly how much data a node can handle depends on your data, mappings, shard sizes and usage patterns, so there is no guarantee that a node with a 30GB heap can handle 600 shards of 50GB each.
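Since the practical limit depends on actual heap pressure and shard count, it helps to check what the node is really doing rather than relying on the guideline numbers alone. A minimal sketch using the _cat APIs from Python (it assumes an unsecured cluster at localhost:9200; adjust the URL and authentication for your deployment):

```python
import requests

ES = "http://localhost:9200"  # assumption: local, unsecured cluster

# Heap and disk usage per node (standard _cat/nodes columns)
nodes = requests.get(
    f"{ES}/_cat/nodes",
    params={"format": "json", "h": "name,heap.percent,heap.max,disk.used,disk.total"},
).json()

# Count shards allocated to each node
shards = requests.get(
    f"{ES}/_cat/shards", params={"format": "json", "h": "node"}
).json()
shards_per_node = {}
for s in shards:
    shards_per_node[s["node"]] = shards_per_node.get(s["node"], 0) + 1

for n in nodes:
    print(
        f'{n["name"]}: heap {n["heap.percent"]}% of {n["heap.max"]}, '
        f'disk {n["disk.used"]}/{n["disk.total"]}, '
        f'{shards_per_node.get(n["name"], 0)} shards'
    )
```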
Also note that query performance is likely to be affected by the amount of data each node holds and the type of storage used. If you have requirements on maximum query latency, this may be what limits the data density you can support.
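If you do have a latency target, it is worth measuring it directly against representative queries at the intended data volume. A rough sketch (the index name my-index is hypothetical; 'took' is the server-side execution time in milliseconds that Elasticsearch reports in every search response):

```python
import requests

ES = "http://localhost:9200"   # assumption: local, unsecured cluster
INDEX = "my-index"             # hypothetical index name

query = {"query": {"match_all": {}}, "size": 10}  # replace with a representative query

resp = requests.post(f"{ES}/{INDEX}/_search", json=query).json()
# 'took' is the server-side execution time in milliseconds (response shape shown is ES 7+)
print(f"took: {resp['took']} ms, hits: {resp['hits']['total']['value']}")
```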