Large number of nodes within a cluster in a local network


In a local high-speed network, are there any downsides or anything to be aware of when attempting to run a single Elasticsearch cluster with 500, 1000, or more nodes, given that the servers hosting the nodes are powerful machines (e.g. 20+ cores, 200+ GB RAM, and SSDs)?

I would just like to know whether having hundreds or thousands of nodes in a cluster is strongly discouraged, or whether it can be an acceptable scenario under the right circumstances.

Many thanks for your help,

What use case is this for?


It is for time-based log monitoring.

To give you a bit more context: what we have is a large number of existing servers on different local high-speed networks, on which logs are generated by the applications running there. To give a rough sense of scale, each local network has at least 1000 servers, each generating around 20GB of logs per day, and we aim to keep at least two weeks' worth.
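As a quick sanity check on those numbers, here is the retention math per local network (the figures are the round numbers above; the replica factor is my assumption, since replication wasn't mentioned):

```python
# Back-of-envelope storage estimate for one local network.
# servers / gb_per_day / retention_days come from the thread;
# the replica count is an illustrative assumption.
servers = 1000          # servers per local network
gb_per_day = 20         # raw log volume per server per day
retention_days = 14     # two weeks of retention

raw_tb = servers * gb_per_day * retention_days / 1000
print(f"raw logs kept: {raw_tb:.0f} TB")

# With one replica per shard, the on-disk footprint roughly doubles:
replicas = 1
print(f"with {replicas} replica: {raw_tb * (1 + replicas):.0f} TB")
```

That works out to about 280TB of raw data per network before replication, which lines up with the "about 300TB" figure mentioned below.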

Because the volume of logs is so big, we thought that setting up dedicated infrastructure to run Elasticsearch clusters would mean provisioning too many new servers, perhaps hundreds. So why not re-use the existing servers we already have to host Elasticsearch nodes? They are powerful, but more importantly there is still quite a lot of free disk space on them, and we calculated that this free space, summed together, is more than enough for our purposes.

But I am concerned about issues I might not be aware of when running too many nodes in a single cluster, so I was wondering whether it's OK to have a monolithic 1000-node cluster on a local network, or whether I should go with multiple smaller clusters on the same network.


Generally we'd recommend clusters of fewer than 200 nodes, purely due to the overheads involved beyond that point (shard allocation, reallocation, cluster state, management, etc.).

But you shouldn't need anything that large, you're only looking at about 300TB after all.

Thanks, I agree we shouldn't need anything that large.

We can't get new servers with large amounts of disk very quickly, though, so we're exploring the option of re-using existing infrastructure. But because each existing server only has roughly 500GB free, or less, we have to combine a lot of them.

One more follow-up question, please: say I now have clusters of 100 nodes, and across all our networks we have 50 or more of those 100-node clusters. If I use tribe nodes to do federated searches across those 50+ clusters and return combined results to users, would that be OK? Is there a rough limit on how many clusters tribe nodes can run federated searches against?
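For reference, a tribe node is configured with one `tribe.*` entry per remote cluster in `elasticsearch.yml`, so federating 50+ clusters means 50+ entries like these (cluster names and host addresses below are placeholders, not from this thread):

```yaml
# elasticsearch.yml on the tribe node; one block per cluster to federate.
tribe:
  logs_cluster_01:
    cluster.name: logs-cluster-01
    discovery.zen.ping.unicast.hosts: ["10.0.1.10", "10.0.1.11"]
  logs_cluster_02:
    cluster.name: logs-cluster-02
    discovery.zen.ping.unicast.hosts: ["10.0.2.10", "10.0.2.11"]
```

Note that the tribe node joins every listed cluster and holds the merged cluster state of all of them in memory, which is part of why fan-out this wide is untested territory.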

I haven't heard of anyone federating that many clusters, and there's a lot of "it depends" in answering that.

You'd need to trial it and see, to be honest; I don't think we've ever tested at that scale.

Thanks, I will keep you updated.


We have decided to try to get dedicated new servers to host the Elasticsearch cluster.

From your point of view, very roughly: if you were to set up a cluster to host log and Beats data coming from 10,000 other hosts, what would you consider a reasonable number of servers/nodes for the cluster?

Assume a node server spec of something like 8+ cores, 64GB RAM, and SSDs.
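A rough way to frame the question is a disk-first estimate. In the sketch below, the per-host daily volume, retention, replica count, and usable disk per node are all my own illustrative assumptions, not numbers from this thread:

```python
# Rough data-node count estimate for a logging cluster fed by 10,000 hosts.
# Every input below is an assumption to be replaced with real measurements.
import math

hosts = 10_000
gb_per_host_per_day = 1.0     # assumed average log/Beats volume per host
retention_days = 14           # assumed retention window
replicas = 1                  # assumed one replica per primary shard
usable_tb_per_node = 4.0      # assumed usable SSD capacity per data node

total_tb = hosts * gb_per_host_per_day * retention_days * (1 + replicas) / 1000
data_nodes = math.ceil(total_tb / usable_tb_per_node)
print(f"~{total_tb:.0f} TB on disk -> at least {data_nodes} data nodes")
```

This only sizes for storage; indexing throughput, query load, and heap limits can all push the node count higher, so treat it as a floor rather than an answer.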


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.