Is there any recommended ratio between the number of master, data, coordinator and ingestion nodes?

Currently working with an equal number of master, data and coordinator nodes (10). Need to add some ingestion nodes.

They all have 2 CPU/node, master and coordinator have 4 GB each, data has 16 GB.

I haven't done the sizing myself, I'm just taking over now.

Is there any recommendation for the ratio of ingestion to data nodes, or of ingestion to coordinator nodes?

How about in terms of resources?

Any help would be greatly appreciated.
Thanks.

Can you provide more context about this? How many nodes do you have, and what are their roles? It is not clear how many master nodes and how many data nodes you have.

Also, coordinating-only and ingest-only nodes are rarely required; normally only big clusters, where sharing those roles has a measurable impact, need dedicated nodes for them.

Sorry, I meant 10 each. 10 master, 10 coordinator, 10 data.

You have 10 master nodes, 10 data nodes and 10 coordinator nodes with a total of 30 nodes in the cluster?

Can you run GET /_cat/nodes on Kibana Dev tools and share the response?

Yes, 30 nodes.

I'm not sure I am allowed to share such info and I'll have to see if I have access yet. Not sure if I mentioned, I'm new in the project, I wasn't the one who designed it, I'm generally new to sizing/hardware/infrastructure stuff.

I'm sorry for the lack of details, I wish I knew more myself.

The only information in the response to the GET /_cat/nodes request that could be seen as sensitive is the IP address of each node, and you can redact it.

It would help us understand how your cluster is configured.
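A minimal sketch of that redaction, assuming the response has been pasted into a string and that the nodes use IPv4 addresses. The sample text and node names below are made up for illustration and are not real output from this cluster.

```python
import re

# Matches dotted-quad IPv4 addresses such as 10.0.0.11.
# Assumption: the nodes use IPv4; IPv6 addresses would need another pattern.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_ips(cat_nodes_output: str) -> str:
    """Replace every IPv4 address in the text with a placeholder."""
    return IPV4.sub("x.x.x.x", cat_nodes_output)


# Hypothetical, shortened sample in the general shape of a _cat/nodes response.
sample = (
    "ip         heap.percent node.role master name\n"
    "10.0.0.11  42           m         *      master-0\n"
    "10.0.0.21  63           d         -      data-0\n"
)

print(redact_ips(sample))
```

The roles and node names stay visible, which is the part that matters for sizing advice.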

But overall, having 10 masters and 10 coordinating-only nodes looks like a waste of resources.

A production cluster needs at least 3 master nodes, and small/medium clusters will probably not need more than that, so you should be able to reduce the number of dedicated master nodes to 3.
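As a rough illustration of why 3 is the usual minimum, here is plain majority arithmetic for master-eligible nodes. This is only a sketch of the general quorum idea, not Elasticsearch's actual voting-configuration logic, and the function names are made up for this example.

```python
def quorum(master_eligible: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many nodes can be lost while a majority is still available."""
    return master_eligible - quorum(master_eligible)


for n in (1, 2, 3, 10):
    print(f"{n} master-eligible: quorum={quorum(n)}, "
          f"can lose {tolerated_failures(n)}")
```

With 1 or 2 master-eligible nodes no failure can be tolerated, while 3 is the smallest count that survives the loss of one node.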

Also, coordinating nodes are rarely needed, especially in small clusters. If you really have 10 coordinating-only nodes, you can probably remove them and point your clients (indexing and search) directly to the data nodes.

To give you more details from the little I know :slight_smile:

Expected traffic is about 100 million logs/day, with average size of 1 KB/log. Retention policy of 1 year.

I don't think it's a small or medium cluster :slight_smile:

100 million documents per day of 1 kB each is about 95 GB of raw data per day. In a year that is about 34 TB. If we assume this is the amount of space it takes up on disk and that you have 1 replica shard, that means in the region of 68 TB of total storage. Sounds like a medium-sized cluster.
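The estimate above can be reproduced with a short calculation. This sketch assumes 1 kB means 1024 bytes, reports binary units (GiB/TiB), and assumes on-disk size equals raw size; the function and parameter names are made up for illustration.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def storage_estimate(docs_per_day, bytes_per_doc, retention_days=365, replicas=1):
    """Return (daily, primary, total) storage in bytes.

    Assumes on-disk size equals raw size, which ignores compression
    and indexing overhead.
    """
    daily_bytes = docs_per_day * bytes_per_doc
    primary_bytes = daily_bytes * retention_days
    total_bytes = primary_bytes * (1 + replicas)
    return daily_bytes, primary_bytes, total_bytes


daily, primary, total = storage_estimate(100_000_000, 1024)
print(f"daily:   {daily / GIB:.1f} GiB")    # 95.4 GiB
print(f"primary: {primary / TIB:.1f} TiB")  # 34.0 TiB
print(f"total:   {total / TIB:.1f} TiB")    # 68.0 TiB
```

Using 1000 bytes per document and decimal units instead gives 100 GB per day, 36.5 TB per year and 73 TB with one replica.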

With 100 million logs/day at an average size of 1 KB, this leads to something around 100 GB per day. With a retention of 1 year, this ends up close to 36.5 TB, and adding replicas for redundancy the total size would be close to 73 TB.

This is a medium-sized cluster; it is pretty similar in size to one that I manage.

The total number of nodes will depend on how you organize your data. For example, if you use a Hot/Warm architecture, you could have faster hot nodes with smaller disks and somewhat slower warm nodes with larger disks.

@leandrojmp @Christian_Dahlqvist

Seems like I underestimated the size of large clusters. Like I said, I don't have much experience with this, so thanks for the help, it's greatly appreciated.

Size is exactly in the middle of your estimations, it's about 70.5 GB.

Yes, we use hot/warm architecture, but the nodes are sized the same per type (all data nodes the same, all coordinating nodes the same).

Shard-wise, there are different numbers of shards per index depending on the type of log and the hot/warm/cold phase, but in total there are 80 indices and about 800 shards, with shard sizes between 15 and 70 GB.
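For a quick sanity check of these figures, the averages can be worked out as below. This sketch takes the total as 70.5 TB (as corrected later in the thread), uses decimal units, and assumes shards and data are spread evenly over the 10 data nodes, which a hot/warm layout would not actually do.

```python
TOTAL_TB = 70.5   # total size, using the TB figure corrected later in the thread
SHARDS = 800      # approximate total shard count quoted above
DATA_NODES = 10   # current number of data nodes

avg_shard_gb = TOTAL_TB * 1000 / SHARDS   # average shard size in GB
shards_per_node = SHARDS / DATA_NODES     # shards per data node if spread evenly
tb_per_node = TOTAL_TB / DATA_NODES       # disk needed per data node if spread evenly

print(f"average shard size: {avg_shard_gb:.1f} GB")
print(f"shards per node:    {shards_per_node:.0f}")
print(f"data per node:      {tb_per_node:.2f} TB")
```

The average comes out above the quoted 15 to 70 GB range, so the total size and the shard count may not describe the same set of shards or the same point in time.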

@leandrojmp could I ask for the specs of the cluster you manage ?

Normally you would have hot nodes with more resources than warm nodes. I do not think you need coordinating-only nodes in this case, especially not 10 of them.

I have a similar cluster size. My hot nodes are also the ingest nodes, and all the clients talk only to the hot nodes; for this reason they have 64 GB of RAM and 16 vCPU. The warm nodes are much smaller, with 2 vCPU and 16 GB of RAM.

The disk size in this case is the same for both types of nodes, but hot nodes use fast SSDs and warm nodes are HDD-backed.

But how many of each ?
(by the way, I added some more info in the previous message)

And if you don't mind me asking, how do you deploy them ? In the Helm charts I only have:

elastic:
    master:
        replicaCount: 10
...
    data:
        replicaCount: 10
...
    coordinator:
        replicaCount: 10

Of course with memory and cpu values for each.

But there is no distinction between hot, warm, cold.

I use traditional VMs, I do not use Kubernetes, so I cannot help with helm charts.

I made a mistake earlier, the total size is 70.5 TB, not GB, obviously :slight_smile:

@leandrojmp do you mind me asking how many nodes of each type you have (besides the 3 master nodes)? If not the exact number, maybe a ballpark figure?

Thank you and thank you for all the info you already provided.
