Scaling EL cluster with low/mid HW

nitram · November 18, 2015, 8:01pm

Hello guys,

We are about to deploy an EL cluster on our older HW to make use of it, and we are looking for some advice.
My current setup is 8x blade servers with internal storage and 88GB of RAM in total (to be distributed across the blades). The EL cluster is expected to hold around 300mil time series documents (logs) in total (20mil/month), approximately 160GB in total size. Indexing is not the main concern, as it is done in bulk from files; the cluster will mainly be used for searches/aggregations (Kibana 4).

My plan is to distribute it as follows:
2x master nodes - holding no data, each with 8GB of RAM (16GB total)
3x workhorse nodes - holding data, each with 16GB of RAM (48GB total)
1x logstash server (+1 cold backup) - running logstash to process CSV files and send them to the EL cluster, with 8GB of RAM
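
The RAM budget of this plan can be checked against the 88GB available. The following is only a rough sketch: the node counts and sizes come from the list above, while the 8GB for the cold-backup logstash server is an assumption (the post only gives a size for the active one), and the helper name `allocated_gb` is made up for illustration.

```python
# Rough RAM budget check for the proposed layout.
TOTAL_RAM_GB = 88

# role -> (number of servers, RAM per server in GB)
plan = {
    "master": (2, 8),
    "data (workhorse)": (3, 16),
    "logstash": (1, 8),
    "logstash cold backup": (1, 8),  # assumption: same size as the active server
}


def allocated_gb(layout):
    """Sum the RAM handed out across all roles in the layout."""
    return sum(count * ram for count, ram in layout.values())


used = allocated_gb(plan)
servers = sum(count for count, _ in plan.values())
print(f"{servers} servers, {used}GB allocated, {TOTAL_RAM_GB - used}GB unallocated")
```

Under that assumption the plan uses 7 of the 8 blades and leaves some RAM unassigned.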

Is this setup OK? Any changes to be done?

And three theoretical questions:

What is better for searches/aggregations - 4 servers with 16GB RAM each or 2 servers with 32GB RAM each?
Is it better for all workhorses to have the same amount of memory, or does it not matter at all?
Should a master node that holds no data have more memory, or should it all be left to the workhorses?

Thanks
m.

warkolm · November 19, 2015, 12:19am

You should really have an uneven number of master nodes, so that when you set min masters it is always a majority (num masters / 2 + 1).
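
The majority formula above (num masters / 2 + 1, with integer division) can be sketched as follows. The function names are made up for illustration; in Elasticsearch versions of this era the resulting value would presumably go into the `discovery.zen.minimum_master_nodes` setting.

```python
def minimum_master_nodes(master_eligible):
    """Majority of master-eligible nodes: n // 2 + 1 (integer division)."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - minimum_master_nodes(master_eligible)


for n in (2, 3, 4, 5):
    print(n, "masters -> quorum", minimum_master_nodes(n),
          ", can lose", tolerated_master_failures(n))
```

With 2 masters the quorum is 2, so losing either one stops master election; with 3 masters the quorum is still 2, so one can fail. That is why an odd number is suggested.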

1 - depends on your searches and data size.
2 - Data nodes should definitely have the same amount of heap
3 - Dedicated masters should not require much memory, maybe 3 GB of a 4GB heap.

nitram · November 19, 2015, 11:41am

Yes I see, so as to avoid split brain.
So in my case I need 3 dedicated master nodes which don't hold data and have low memory (4GB: 2GB heap, 2GB left for the OS).
Then I can have any number of data nodes, since 3 dedicated master nodes are quite safe.

With 3 dedicated data nodes with a high amount of RAM (equally distributed), I will split the data into monthly indices, each with 3 shards and 1 replica.
What do you think? Good thinking?

warkolm · November 20, 2015, 6:02am

For the number of nodes in your cluster, you are probably better off just adding another data node or two. Dedicated masters are nice but overkill here.

nitram · November 21, 2015, 12:39pm

So you would suggest not having dedicated master nodes (i.e. having more data nodes, some of which also have master functionality)?
I was also thinking of having more nodes with less memory, but I didn't know whether it has any advantages.

warkolm · November 21, 2015, 11:04pm

For your current use case, yes.

nitram · November 23, 2015, 11:28am

I have reevaluated all the possibilities and concluded the following:

3x dedicated master nodes (4-8GB RAM each)
6x data nodes (16GB RAM each: 8GB heap, 8GB for garbage collection + OS)

Therefore I will use monthly indices, each divided into 6 shards (so that each node holds one shard) and 2 replicas to make sure it is resilient enough.
I will use the master nodes as client nodes and have logstash installed on one of them to process requests and push them directly to the data nodes for indexing.
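
To see what 6 shards with 2 replicas means in practice, here is a rough sketch of the arithmetic. It assumes that shards are spread evenly across the data nodes, that the 160GB from the first post is the size before replication, and that 300mil documents at 20mil/month means 15 monthly indices. The function names are made up for illustration.

```python
def shard_copies_per_index(primaries, replicas):
    """Each primary shard exists once plus once per replica."""
    return primaries * (1 + replicas)


def copies_per_node(primaries, replicas, data_nodes):
    """Average shard copies per data node, assuming an even spread."""
    return shard_copies_per_index(primaries, replicas) / data_nodes


DATA_NODES = 6
PRIMARIES = 6
REPLICAS = 2
MONTHS = 300 // 20      # 300mil docs at 20mil/month -> 15 monthly indices
RAW_SIZE_GB = 160       # assumption: size before replication

per_index = shard_copies_per_index(PRIMARIES, REPLICAS)
per_node = copies_per_node(PRIMARIES, REPLICAS, DATA_NODES)
total_shards = per_index * MONTHS
stored_gb = RAW_SIZE_GB * (1 + REPLICAS)

print(f"{per_index} shard copies per monthly index, {per_node:.0f} per data node")
print(f"{total_shards} shard copies over {MONTHS} months, about {stored_gb}GB on disk")
```

So each data node would hold 3 shard copies per monthly index, not one, and 2 replicas roughly triple the disk space needed.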

My main focus is aggregations over all the logs, with occasional searches.

Any comments please?
Anyway @warkolm thank you for your help

nitram · November 27, 2015, 9:57am

Opinion?
