Memory allocation against the index size

Hi all, we are predicting that our index size will be around 175 GB.
We came up with a cluster topology of 2 data nodes and 1 master-only node.

Each data node will have 250 GB of storage and 64 GB of RAM, with 30 GB allocated to the heap.
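For context, our 30 GB figure follows the widely cited Elasticsearch heap-sizing rule of thumb: give the JVM no more than half of the machine's RAM, and keep the heap below the ~32 GB threshold so compressed object pointers stay enabled (the rest is left for the OS filesystem cache). A minimal sketch of that rule:

```python
def recommended_heap_gb(total_ram_gb: int) -> int:
    """Rule-of-thumb heap size for an Elasticsearch data node:
    at most half of RAM, capped just under 32 GB so the JVM
    can keep using compressed object pointers (compressed oops)."""
    return min(total_ram_gb // 2, 31)

# A 64 GB machine caps out at the compressed-oops limit,
# so anything up to ~31 GB (our 30 GB) is a safe choice.
print(recommended_heap_gb(64))
```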

The master-only node will have just 5 GB of storage and 5 GB of RAM.

  1. In your expert opinion, will this configuration work, or do we need to add more nodes?
  2. Will the default setup (5 shards, 1 replica) work for the index, or do we need more shards?
  3. Is there a way to calculate the amount of memory and the number of CPU cores required per index size?

You should always aim to have 3 master-eligible nodes in a cluster, so I would set this up as one dedicated master node and two combined master/data nodes.
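As a rough sketch, the role split looks like this in each node's elasticsearch.yml (pre-7.x syntax, matching the 5-shard default mentioned in this thread):

```yaml
# Dedicated master-only node: master-eligible, holds no data
node.master: true
node.data: false

# The two master/data nodes: master-eligible and hold data
node.master: true
node.data: true
```

With three master-eligible nodes, also make sure discovery.zen.minimum_master_nodes is set to 2 to avoid split-brain.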

The ideal shard size and shard count will depend on the data and queries, as well as the number of concurrent queries (assuming this is a query-heavy use case). I would recommend running some tests/benchmarks to determine the optimal configuration for your use case. Have a look at this Elastic{ON} talk for details.

This is also very closely tied to the data, queries, and expected load, so it is best answered through benchmarking.

Thanks for the quick response. I'm sorry I was not clear in describing my topology. We have 3 master-eligible nodes, but only two of them are data nodes. The third, master-only node is there solely for electing a new master and will be piggybacking on a different server.

Would it be a good approach to shrink the data volume and cluster resources, run the benchmark at that smaller scale, and then extrapolate the results? Or do we need to benchmark with the actual volume of data?

I saw on the internet that the ideal shard size is 20-40 GB. Is this true?
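Taking that commonly cited 20-40 GB range at face value, a quick back-of-the-envelope check for our projected 175 GB index would be:

```python
import math

index_size_gb = 175    # projected index size from this thread
target_shard_gb = 40   # upper end of the commonly cited 20-40 GB range

# Minimum number of primary shards to keep each shard within the target size
primary_shards = math.ceil(index_size_gb / target_shard_gb)
per_shard_gb = index_size_gb / primary_shards

print(primary_shards, per_shard_gb)  # 5 primaries at ~35 GB each
```

Interestingly, the pre-7.0 default of 5 primary shards would land at roughly 35 GB per shard, inside that range, though as noted above the real answer should come from benchmarking.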

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.