Elasticsearch cluster planning/sizing

I want to deploy Elasticsearch with ECK to enable search in a large production GitLab cluster. I read that the Elasticsearch cluster size should be about 0.5 of the total size of all repos (0.5 * 2 TB). My question is about ES cluster planning: are there any rules of thumb for the initial cluster requirements in this scenario, for both K8s and ES (number of nodes, node types, etc.)? Of course, this is only a preliminary step before benchmarking.

This recommendation is not from Elastic; it is from GitLab.

If you follow this recommendation, you would need at least 1 TB in total across your data nodes. However, I'm not sure whether GitLab's recommendation takes replicas into account. If it does not, then you will need at least 2 TB across your data nodes to use replicas (one replica per primary shard doubles the storage).
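To make the arithmetic above explicit, here is a minimal sketch. The function name and parameters are made up for illustration, the 0.5 ratio is the GitLab figure quoted in the question, and it assumes one replica per primary when replicas are enabled.

```python
def required_storage_tb(repo_size_tb: float,
                        index_ratio: float = 0.5,
                        replicas: int = 0) -> float:
    """Rough total disk needed across all data nodes, in TB.

    index_ratio is the GitLab rule of thumb quoted in the question
    (index size is about 0.5 of total repo size); replicas is the number
    of replica copies per primary shard. Both are assumptions to be
    validated by benchmarking.
    """
    primary_tb = repo_size_tb * index_ratio
    # Every replica is a full copy of the primary data.
    return primary_tb * (1 + replicas)


print(required_storage_tb(2.0))              # primaries only
print(required_storage_tb(2.0, replicas=1))  # one replica per primary
```

With 2 TB of repositories this gives 1 TB without replicas and 2 TB with one replica, matching the numbers above.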

Regarding the Elasticsearch cluster: to have a resilient cluster you need at least 3 master-eligible nodes. Those nodes can also be data nodes, but dedicated masters are recommended, so a minimal cluster with dedicated masters would have 5 nodes: 3 dedicated master nodes and 2 data nodes.
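Since you are using ECK, this topology maps to two nodeSets in the Elasticsearch resource. Below is a rough sketch that builds such a manifest as a Python dict and prints it as JSON (which `kubectl apply -f` accepts). The cluster name, the version, and the 1Ti volume size are placeholders I picked for illustration, not recommendations; adjust roles and storage to your environment.

```python
import json


def build_eck_manifest(name: str, version: str,
                       master_count: int = 3,
                       data_count: int = 2,
                       data_storage: str = "1Ti") -> dict:
    """Build an ECK Elasticsearch resource with dedicated masters.

    name, version and data_storage are illustrative placeholders.
    """
    data_volume = {
        "metadata": {"name": "elasticsearch-data"},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": data_storage}},
        },
    }
    return {
        "apiVersion": "elasticsearch.k8s.elastic.co/v1",
        "kind": "Elasticsearch",
        "metadata": {"name": name},
        "spec": {
            "version": version,
            "nodeSets": [
                # Dedicated master nodes: no data role.
                {"name": "master", "count": master_count,
                 "config": {"node.roles": ["master"]}},
                # Data nodes holding the index shards.
                {"name": "data", "count": data_count,
                 "config": {"node.roles": ["data"]},
                 "volumeClaimTemplates": [data_volume]},
            ],
        },
    }


# "gitlab-search" and "8.13.0" are hypothetical values.
manifest = build_eck_manifest("gitlab-search", "8.13.0")
print(json.dumps(manifest, indent=2))
```

The data nodes here only carry the `data` role; add other roles (for example `ingest`) if your setup needs them.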

So your initial cluster would have 5 nodes. Depending on your resources, you can either spin up 2 data nodes with enough space for all your data, or start with smaller data nodes and add more if needed.
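To compare the two options, here is a quick sketch of how the disk per data node shrinks as you add data nodes. The function name is made up, and the 2 TB total assumes the 0.5 ratio plus one replica; it also assumes shards are spread evenly and leaves no free-space headroom, so treat the output as a lower bound.

```python
def disk_per_data_node_tb(total_storage_tb: float, data_nodes: int) -> float:
    """Minimum disk per data node if data is spread evenly (no headroom)."""
    if data_nodes < 1:
        raise ValueError("need at least one data node")
    return total_storage_tb / data_nodes


# 2 TB total = 1 TB of primaries plus one replica (assumption).
for nodes in (2, 4, 8):
    print(nodes, "data nodes ->", disk_per_data_node_tb(2.0, nodes), "TB each")
```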
