What are the top scaling factors for growing into a large ES cluster? My
operations department has some concerns about how we'd grow an ES cluster
to hundreds or thousands of nodes. Does an ES cluster
require a non-blocking network? They define this as a network in which
all nodes are linked with uniform throughput and latency. They see ES's
rack-aware configuration and worry a little, having run into problems with
Hadoop clusters in the past: the fast-talking intra-rack nodes were able to
saturate the inter-rack data channels, causing the overall system to
suddenly hit a wall. They've also been burned by many hundreds of nodes
saturating a network while trying to keep state synchronized.
From what I can tell, the master servers ought to keep the state chatter
to a minimum.
Would someone in the know, or someone with experience running large ES
deployments, mind describing the important scaling factors of large ES
deployments and at roughly what thresholds I'm likely to hit them?