This is our current cluster setup on VMware:
3 master nodes: 8GB RAM each, plus
2 data nodes: 8GB RAM / 500GB disk space each, each on a different physical disk for redundancy.
We recently started collecting NetFlow data with ElastiFlow. The data volume is about 25GB per day; with one
replica, that is about 50GB of disk space per day. The data retention period will be at least five days.
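As a quick check of these numbers, here is a minimal sketch of the storage arithmetic. It uses only the figures above and ignores index overhead and future growth, so the result is best read as a lower bound; the variable names are purely illustrative.

```python
# Rough storage estimate from the figures in this post.
# Index overhead, merges, and traffic growth are NOT modelled.
daily_primary_gb = 25   # NetFlow data ingested per day
replicas = 1            # one replica copy of every shard
retention_days = 5      # minimum retention period

# Each day stores the primary data plus one copy per replica.
daily_total_gb = daily_primary_gb * (1 + replicas)

# Disk needed to hold the full retention window across the cluster.
retained_gb = daily_total_gb * retention_days

print(f"Per day: {daily_total_gb} GB, retained: {retained_gb} GB")
```

With retention of at least five days, this is the minimum the data nodes have to hold in total, before any headroom.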
To accommodate the increasing data flow and volume, we are considering two scaling options:
1. Add two more data nodes (8GB RAM, 500GB disk each) to the cluster, and make sure each node's storage is on a different physical disk (thanks to shard allocation awareness).
The cluster will then be 3 master nodes (8GB RAM) plus 4 data nodes (8GB RAM, 500GB disk each).
2. Add 8GB RAM and 500GB of disk to each of the two existing data nodes.
The cluster will then be 3 master nodes (8GB RAM) plus 2 data nodes (16GB RAM, 1TB disk each).
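To put the two layouts side by side, here is a rough sketch that compares them on total capacity and on how much of that capacity sits on a single data node. The 50GB/day figure comes from the numbers above, the 85% figure is Elasticsearch's default low disk watermark, and 1TB is treated as 1000GB. The `Layout` class and its method names are made up for illustration. Heap sizing, shard counts, and query load are not modelled.

```python
from dataclasses import dataclass

DAILY_GB = 50            # 25GB of primaries plus one replica, per the figures above
LOW_WATERMARK_PCT = 85   # default cluster.routing.allocation.disk.watermark.low


@dataclass
class Layout:
    """Hypothetical description of the data-node tier only."""
    name: str
    data_nodes: int
    ram_gb_per_node: int
    disk_gb_per_node: int

    def total_ram_gb(self) -> int:
        return self.data_nodes * self.ram_gb_per_node

    def total_disk_gb(self) -> int:
        return self.data_nodes * self.disk_gb_per_node

    def days_until_low_watermark(self) -> int:
        # Days of data that fit before the cluster reaches the low watermark.
        usable_gb = self.total_disk_gb() * LOW_WATERMARK_PCT // 100
        return usable_gb // DAILY_GB

    def capacity_lost_per_node_pct(self) -> int:
        # Share of data-node capacity that disappears if one node fails.
        return 100 // self.data_nodes


layouts = [
    Layout("4 nodes x (8GB RAM, 500GB disk)", 4, 8, 500),
    Layout("2 nodes x (16GB RAM, 1TB disk)", 2, 16, 1000),  # 1TB taken as 1000GB
]

for layout in layouts:
    print(
        f"{layout.name}: "
        f"{layout.total_ram_gb()}GB RAM, {layout.total_disk_gb()}GB disk, "
        f"~{layout.days_until_low_watermark()} days to low watermark, "
        f"{layout.capacity_lost_per_node_pct()}% of capacity per node"
    )
```

Under these assumptions both layouts end up with the same total RAM and disk, so the difference is mainly in how the capacity is spread across nodes and how much of it a single node failure takes away.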
What are the pros and cons for each option? Any hints?