We started using ES two years ago, and at that time we created multiple
clusters, each with a single index.
Current model:
Number of clusters: 10
Number of indexes per cluster: 1
Number of primary shards per index: 5
Number of replicas: 1
Per index size: 50 GB
Number of nodes per cluster: 3 (each both master and data)
All of these run on 3 big servers.
Because every node acts as both a master and a data node, the master
becomes heavily loaded, and if one master goes down, the other masters
also go down quickly. The end result is that the cluster becomes
unavailable.
To resolve this, we are considering the following solutions:
Add more hardware, use dedicated master and data nodes, and increase the
number of data nodes to allow for future data growth and to reduce the
maintenance impact if a cluster goes down. (or)
Merge the clusters so that each cluster holds 3 indexes, with 3 dedicated
master nodes and enough data nodes, say 8. Each cluster would then have 3
indexes, i.e. 15 primary and 15 replica shards, in a cluster of 3 master
and 8 to 10 data nodes. (or)
Create one bigger cluster and move all 10 indexes into it, i.e. 50 primary
and 50 replica shards, with 3-5 master nodes and 20-30 data nodes.
Which option is best for minimizing downtime and management effort while
also supporting future data growth?
ES scales horizontally, so you should consider one cluster of many nodes
and multiple indexes rather than many clusters. This will also save on
management overhead.
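To put rough numbers on the options from the question, here is a small
sketch of the shard arithmetic. It assumes every index keeps 5 primary
shards and 1 replica, as in the current model; the function names are
made up for illustration, and the data-node counts are the ones proposed
in the question.

```python
# Figures from the question: 5 primary shards per index, 1 replica.
PRIMARIES_PER_INDEX = 5
REPLICAS = 1


def total_shards(indexes, primaries=PRIMARIES_PER_INDEX, replicas=REPLICAS):
    """Primary plus replica shards for identically configured indexes."""
    return indexes * primaries * (1 + replicas)


def avg_shards_per_node(indexes, data_nodes):
    """Average number of shards each data node would hold."""
    return total_shards(indexes) / data_nodes


# Merged clusters: 3 indexes on 8 data nodes.
print(total_shards(3), avg_shards_per_node(3, 8))
# Single cluster: 10 indexes on 20 data nodes.
print(total_shards(10), avg_shards_per_node(10, 20))
```

In both layouts each data node ends up with only a handful of shards, so
this arithmetic alone does not argue against the single large cluster.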
Another point: set the shard count to a multiple of the node count. You
have 3 nodes, so use 3/6/9/etc. shards. This gives you balanced allocation
and lets you add more nodes later while keeping that even allocation
(if you use more than 3 shards, obviously).
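The effect of choosing a multiple of the node count can be sketched as
follows. This is only an illustration: it assumes shards are spread as
evenly as possible by count, which simplifies what the real allocator
does, and shard_spread is a hypothetical helper, not an ES API.

```python
def shard_spread(primaries, replicas, nodes):
    """Best-case shards per node if shards are spread evenly by count."""
    total = primaries * (1 + replicas)
    base, extra = divmod(total, nodes)
    # 'extra' nodes carry one shard more than the rest.
    return [base + 1] * extra + [base] * (nodes - extra)


def is_balanced(primaries, replicas, nodes):
    """True when every node can hold the same number of shards."""
    return len(set(shard_spread(primaries, replicas, nodes))) == 1


print(shard_spread(5, 1, 3))  # current model: 5 primaries, 1 replica, 3 nodes
print(shard_spread(6, 1, 3))  # 6 primaries on the same 3 nodes
print(shard_spread(6, 1, 6))  # same index after growing to 6 nodes
```

With 5 primaries and 1 replica on 3 nodes, one node always carries an
extra shard; with 6 primaries the spread is even, and it stays even when
the cluster grows to 6 nodes.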