Currently we have 26 nodes in total in an ES cluster: 3 master nodes, 2 client nodes, and 20 data nodes.
Each ES data node has 16 cores, 32 GB of memory, 4.6 TB of HDD in RAID 5, and a 1000 Mbps NIC.
We want to spread all indexes across all data nodes in order to make better use of each server's computing capacity and to retain data for a long time.
My question is: what is the best sharding solution across all 20 ES data nodes? (It should also tolerate 1 or 2 data nodes being down due to network or hardware failure.)
It seems that with 10 primary shards and 1 replica, we only get half of the total disk space of the data nodes for unique data.
If we have 100 TB of space in total across the 20 ES nodes, we can only use 50 TB for primary shards, with the rest of the disk space going to replicas. The problem is that our indexes will probably exceed 50 TB per year.
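For reference, this is roughly the setup I'm describing, shown as a sketch with the official Python client (elasticsearch-py 8.x style; the endpoint and the index name are just placeholders):

```python
from elasticsearch import Elasticsearch

# Placeholder endpoint; in our cluster this would point at a client node.
es = Elasticsearch("http://localhost:9200")

# 10 primary shards + 1 replica means every shard is stored twice,
# so 100 TB of raw disk only holds about 50 TB of unique index data.
es.indices.create(
    index="logs-example",  # hypothetical index name
    settings={
        "index.number_of_shards": 10,
        "index.number_of_replicas": 1,
    },
)
```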