Real world elastic sizing

Cooker · June 13, 2016, 9:29am

I have been running the whole ELK stack on a single VM with 6 cores and 20GB of RAM, and all is well.

I am looking to scale up the amount of data we store. The actual rate of messages will stay the same; we just want to keep months of data instead of days.

On this basis I believe the bottleneck is not going to be Logstash (messages per second is not going up) but Elasticsearch.

I can scale vertically and just give the instance more CPU, disk I/O and space, and memory (simple changes to make), or I could look at scaling out into multiple VMs. That would give some redundancy but also wastes more disk space, so I probably don't want to do it unless I have to: we have no need for HA, and messages can queue during reboots or problems. Does anyone have any real-world examples of large Elasticsearch instances? How big is too big?

magnusbaeck · June 13, 2016, 9:34am

Horizontal scaling isn't just for improving availability. Shards will be distributed evenly among the cluster's data nodes regardless of whether you have replicas.
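
To illustrate the point about shard distribution, here is a minimal sketch that spreads the primary shards of one index across data nodes in round-robin order. It is only a toy model: the function name `distribute_shards`, the node names, and the shard count are all hypothetical, and the real Elasticsearch allocator weighs more factors than this.

```python
from collections import defaultdict


def distribute_shards(num_shards, nodes):
    """Toy round-robin placement of primary shards onto data nodes.

    Illustration only; this is not how Elasticsearch's allocator is implemented.
    """
    placement = defaultdict(list)
    for shard in range(num_shards):
        node = nodes[shard % len(nodes)]
        placement[node].append(shard)
    return dict(placement)


# Hypothetical example: 5 primary shards, no replicas, 3 data nodes.
layout = distribute_shards(5, ["node-1", "node-2", "node-3"])
for node, shards in layout.items():
    print(node, shards)
```

Even with zero replicas, every node ends up holding part of the index, so indexing and search work is shared across the nodes.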

JoarSvensson · June 13, 2016, 9:37am

By the nature of Elasticsearch, it's recommended to scale horizontally, especially if you want to make use of replication. As previously stated, in most (almost all) cases multiple nodes are the way to go for a production deployment.

Cooker · June 13, 2016, 9:43am

Sure. The issue I have is that any additional VMs will almost certainly be sharing the same hardware and physical disks. Is it still going to be best to scale that way? How horizontal or vertical should I go?

E.g. let's imagine I have 80GB of RAM, 24 cores and 10TB of disk to play with.

So I could go Huge:

1 VM with 24 cores, 80GB of RAM and 10TB of disk

Big:

3 VMs, each with 8 cores, 25GB of RAM and 3.3TB of disk

Medium:

10 VMs, each with 2 cores, 8GB of RAM and 1TB of disk

or Micro:

20 VMs, each with 1 core, 4GB of RAM and 500GB of disk

Again, they will likely be sharing the same hardware.
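
As a quick sanity check on the four layouts above, the following sketch totals what each one consumes from the 24-core, 80GB, 10TB budget. The `Layout` class and its field names are made up for this illustration; the numbers are the ones from the post.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    name: str
    vms: int
    cores_per_vm: int
    ram_gb_per_vm: float
    disk_tb_per_vm: float

    def totals(self):
        """Return (total cores, total RAM in GB, total disk in TB)."""
        return (
            self.vms * self.cores_per_vm,
            self.vms * self.ram_gb_per_vm,
            self.vms * self.disk_tb_per_vm,
        )


# Figures taken from the post; 500GB is written as 0.5TB.
LAYOUTS = [
    Layout("Huge", 1, 24, 80, 10),
    Layout("Big", 3, 8, 25, 3.3),
    Layout("Medium", 10, 2, 8, 1),
    Layout("Micro", 20, 1, 4, 0.5),
]

for layout in LAYOUTS:
    cores, ram, disk = layout.totals()
    print(f"{layout.name:<7}{layout.vms:>3} VMs: {cores} cores, {ram:g}GB RAM, {disk:.1f}TB disk")
```

Worked through, Medium and Micro as listed add up to 20 cores rather than 24, and Big comes to 75GB of RAM and about 9.9TB of disk, so the options are close to, but not exactly, the same total budget.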

Christian_Dahlqvist · June 13, 2016, 10:22am

If all nodes would end up on the same hardware anyway, it probably makes sense to use as few nodes as possible. If the hardware also needs to host Logstash, a single Elasticsearch node with around 64GB of RAM and 30GB heap may be the way to go.
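
The 64GB RAM / 30GB heap figures line up with a commonly cited rule of thumb that is not spelled out in this thread: give the JVM heap no more than about half of the machine's RAM, and keep it below roughly 32GB so the JVM can keep using compressed object pointers. A minimal sketch of that rule, assuming a conservative 30GB cap (the function name and the exact cap value are assumptions for illustration):

```python
def suggested_heap_gb(ram_gb, cap_gb=30.0):
    """Rule-of-thumb heap size: half of RAM, capped below the ~32GB
    compressed-pointer threshold. The 30GB default cap is an assumption."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return min(ram_gb / 2, cap_gb)


# The node suggested above, plus the per-VM RAM sizes from the earlier post.
for ram in (64, 80, 25, 8, 4):
    print(f"{ram}GB RAM: about {suggested_heap_gb(ram):g}GB heap")
```

The rest of the RAM is not wasted: the operating system uses it for the filesystem cache, which Elasticsearch relies on heavily. Under this rule a single 80GB VM would still get only about 30GB of heap, which is one reason a node of around 64GB is a natural ceiling.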
