I am building a new high-throughput logging cluster, receiving about 150-200 GB of logs a day across ~60 indices (though mostly weighted towards 2 or 3 of them).
We currently have a setup with 5 data nodes with 4 CPUs apiece and 15 GB of memory. I'm trying to figure out what the performance considerations would be for that setup versus one with more, smaller machines (e.g. 10 nodes with 2 CPUs and 7.5 GB of memory each).
Here are the ups and downs I can think of for each topology, based on my current understanding:
More Smaller Machines
- Assuming the shard count is increased to match the machine count, faster indexing for the same volume of data.
- Larger cluster state for master nodes to maintain.
Fewer Larger Machines
- More memory per machine, so less risk of getting out-of-memory errors when trying to load a particularly large shard into memory. We do have some indices with primary shard sizes upwards of 10-20 GB, so this is a pertinent consideration.
- More shards living on the same node, which can lead to resource competition from many concurrent queries.
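To put rough numbers on the two options, here is a minimal sketch of the arithmetic. The `Topology` class and its methods are names I made up for illustration, and it assumes ingest is spread evenly across data nodes, which is not really true here given that 2 or 3 indices carry most of the volume. It also ignores replicas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Hypothetical helper describing a set of identical data nodes."""
    name: str
    nodes: int
    cpus_per_node: int
    mem_gb_per_node: float

    @property
    def total_cpus(self) -> int:
        return self.nodes * self.cpus_per_node

    @property
    def total_mem_gb(self) -> float:
        return self.nodes * self.mem_gb_per_node

    def daily_gb_per_node(self, daily_gb: float) -> float:
        # Assumption: ingest is spread evenly across nodes (no skew, no replicas).
        return daily_gb / self.nodes


current = Topology("5 x 4 CPU / 15 GB", 5, 4, 15.0)
smaller = Topology("10 x 2 CPU / 7.5 GB", 10, 2, 7.5)

for t in (current, smaller):
    low = t.daily_gb_per_node(150)
    high = t.daily_gb_per_node(200)
    print(f"{t.name}: {t.total_cpus} CPUs, {t.total_mem_gb:g} GB RAM total, "
          f"{low:g}-{high:g} GB/day per node")
```

Both layouts come out to the same 20 CPUs and 75 GB of memory in total, so the difference is only in how those resources and the daily volume are divided per node (30-40 GB/day per node versus 15-20 GB/day under the even-spread assumption).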
These are pretty basic, but are there any other significant things to consider about each approach?
I know topology questions are particular to every use case and that the best way to find out the best choice is through testing, which I plan to do, but I just want to make sure I understand which metrics I am stretching by pushing either approach.
More larger machines could also be an option.