Setting up Elastic Cloud to handle large data - does adding new nodes just replicate the data, or distribute it?

Hi all,

So we currently have a single node in Elastic Cloud, and I've been advised we need multiple nodes because a number of shards can't be assigned with only one node (replica shards can't live on the same node as their primary).
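For context, here's roughly how I've been checking which shards are unassigned and why - just a minimal sketch using Python's requests library against the cluster's REST API; the endpoint and API key are placeholders for our deployment's values:

```python
import requests

ES_URL = "https://localhost:9200"              # placeholder endpoint
HEADERS = {"Authorization": "ApiKey <your-api-key>"}  # placeholder credentials

# List shards that are not assigned, with the reason code
resp = requests.get(
    f"{ES_URL}/_cat/shards",
    params={"v": "true", "h": "index,shard,prirep,state,unassigned.reason"},
    headers=HEADERS,
)
for line in resp.text.splitlines():
    if "UNASSIGNED" in line:
        print(line)

# Ask the cluster to explain an arbitrary unassigned shard
# (returns 400 if nothing is unassigned)
explain = requests.get(f"{ES_URL}/_cluster/allocation/explain", headers=HEADERS)
print(explain.json())
```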

So, we have one node with 928 GB storage capacity, and we are sitting at 82% utilization of that.

Being fairly new to taking over the Elasticsearch side of things from a previous developer, I'm a little uncertain how to set this up, being both performance- and cost-conscious.

As I understand it, if I add a second node, all that will do is replicate the first node - thus I'll have two nodes with 928 GB capacity, both at 82%, and double the cost?

OR

If I create a second node - will that essentially 'distribute' the data? So if I kept the same size indices, each node would have 928 GB capacity but sit at only 41% utilization. In theory, then, I could have 2 smaller nodes, or even 3 even smaller nodes - to increase reliability - but without necessarily tripling the cost?

So, yes and no. As you add more nodes, you can distribute the primary shards across them. You can also add 1 or more replicas of each shard for high availability, and those replicas will likewise be distributed across the cluster. So it really depends on your setup.
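For example, the replica count is a per-index setting you can change at any time. A rough sketch with Python's requests library - the endpoint, API key, and index name are placeholders:

```python
import requests

ES_URL = "https://localhost:9200"              # placeholder endpoint
HEADERS = {"Authorization": "ApiKey <your-api-key>"}  # placeholder credentials

# On a single node, replicas of existing primaries have nowhere to go.
# Once a second node joins, Elasticsearch will place each replica on a
# different node than its primary, giving you high availability.
requests.put(
    f"{ES_URL}/my-index/_settings",  # "my-index" is a placeholder
    json={"index": {"number_of_replicas": 1}},
    headers=HEADERS,
)
```

Setting `number_of_replicas` to 0 is also how you'd clear the unassigned-shard warnings while you're still on one node, at the cost of having no redundancy.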

Thank you for the reply @legoguy1000 !

I guess that's where I'm a little stuck. In the first instance, we simply created a single node, as described above. But we aren't in a position financially to just duplicate it, so we need to keep costs down while still handling the amount of data we have.

I'm a little lost as to whether it's possible to spread the 700+ GB across 2-3 smaller nodes while keeping costs roughly in line with the single node we have now.

If it is possible, then I'll keep researching how it's done from posts in the forum / documentation.

So yes, you could have 3 smaller nodes as opposed to 1 large node. Whether that's cheaper, and whether it improves performance, I don't have a good answer for you. Performance is going to depend on how much data you're ingesting, how fast, and the kinds and number of searches...
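If you want to verify how the data actually spreads once the new nodes join, the _cat/allocation API shows shard count and disk use per node. A minimal sketch along the same lines as above (endpoint and credentials are placeholders):

```python
import requests

ES_URL = "https://localhost:9200"              # placeholder endpoint
HEADERS = {"Authorization": "ApiKey <your-api-key>"}  # placeholder credentials

# Per-node view: shard count, disk used by indices, and disk percent -
# on a balanced cluster each node should carry a similar share
resp = requests.get(
    f"{ES_URL}/_cat/allocation",
    params={"v": "true", "h": "node,shards,disk.indices,disk.used,disk.percent"},
    headers=HEADERS,
)
print(resp.text)
```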

That is really encouraging, thank you. I totally understand you can't answer on those other aspects, but I appreciate knowing it's possible to distribute the total data across smaller nodes without simply needing 2x the current node size.
