Managing Master Nodes and Data Nodes

Zay_Lin_Htun · November 5, 2022, 3:12am

Dear all,

I am new to ELK security, and I am trying to deploy ELK with master nodes and data nodes to handle massive amounts of data, but I cannot find good documentation on how to deploy and configure a cluster for my use case.

If you have any resources, I would appreciate it if you could share them.

leandrojmp · November 5, 2022, 11:15am

Can you provide more context on your use case and what your doubts are?

How much data are you talking about?

There is plenty of documentation about nodes and scaling Elasticsearch.

Zay_Lin_Htun · November 5, 2022, 8:53pm

My daily log ingest is close to 100 GB per day. I want to keep the data for 30 days and then forward it to AWS S3 buckets for long-term storage. I want to know how to plan my cluster and how to configure the master and data nodes.

leandrojmp · November 6, 2022, 1:13am

The best way to find the right size for a cluster is to do a proof of concept with real data to get information about the event rate, daily volume, index speed, search speed etc.

But 100 GB per day is pretty small. For example, one of the recommendations from Elastic is to keep the shard sizes of your indices around 50 GB.

With 100 GB/day and 30 days of retention you will need around 3 TB of usable space. To have some resilience you need at least 3 master-eligible nodes, so the smallest cluster with some resilience would be a 3-node cluster.
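The arithmetic behind these numbers can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: one index per day, on-disk size roughly equal to the ingested size, and no allowance for indexing overhead or compression. The function name `size_cluster` and its parameters are made up for illustration.

```python
import math

def size_cluster(daily_gb, retention_days, replicas=0, target_shard_gb=50):
    """Rough storage and shard estimate.

    Assumes one index per day and that on-disk size is about the same
    as the ingested size (both are simplifications).
    """
    primary_gb = daily_gb * retention_days      # data kept on primaries
    total_gb = primary_gb * (1 + replicas)      # primaries plus replica copies
    # Primary shards needed per daily index to stay near the shard size target.
    shards_per_daily_index = math.ceil(daily_gb / target_shard_gb)
    return {
        "primary_gb": primary_gb,
        "total_gb": total_gb,
        "primary_shards_per_daily_index": shards_per_daily_index,
    }

estimate = size_cluster(daily_gb=100, retention_days=30)
print(estimate)
```

For 100 GB/day and 30 days this gives 3000 GB of primary data and 2 primary shards per daily index. A proof of concept with real data is still the only way to confirm the actual on-disk size.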

It is better to have dedicated master nodes, so you could have a cluster with 3 dedicated master nodes and 2 data nodes, which lets you have replicas for your shards.

By the AWS S3 part, do you mean keeping a backup of your indices? You can do that by creating a repository for snapshots, and if you need some old data you can restore it from a snapshot.
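As a minimal sketch of what that involves, the Python below builds the REST requests for registering an S3 snapshot repository and restoring indices from a snapshot. It only constructs the method, path, and JSON body and does not contact a cluster. The repository name, bucket name, snapshot name, and index name are placeholders, and the cluster is assumed to already have credentials for S3.

```python
import json

def register_s3_repository(repo_name, bucket):
    """Return (method, path, body) for registering an S3 snapshot repository."""
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return "PUT", f"/_snapshot/{repo_name}", json.dumps(body)

def restore_snapshot(repo_name, snapshot_name, indices):
    """Return (method, path, body) for restoring selected indices."""
    body = {"indices": ",".join(indices)}
    return "POST", f"/_snapshot/{repo_name}/{snapshot_name}/_restore", json.dumps(body)

# Placeholder names, for illustration only.
print(register_s3_repository("s3_backup", "my-log-archive"))
print(restore_snapshot("s3_backup", "snapshot-2022.11.01", ["logs-2022.11.01"]))
```

These requests can be sent from Kibana Dev Tools or any HTTP client once security and S3 access are set up.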

stephenb · November 6, 2022, 6:32am

With replicas, about 6 TB total.

You could probably easily run this on 3 master/data nodes,
each with 8 CPUs, 64 GB RAM, and 2.5 TB of SSD or HDD storage (preferably SSD).
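A quick way to sanity-check that layout against the 6 TB figure is sketched below. It assumes the default low disk watermark of 85%, above which Elasticsearch stops allocating new shards to a node, and treats that as the usable fraction of each disk. The function name is hypothetical.

```python
def usable_capacity_tb(nodes, disk_tb_per_node, low_watermark=0.85):
    """Capacity available before nodes reach the low disk watermark."""
    return nodes * disk_tb_per_node * low_watermark

needed_tb = 6.0                               # 3 TB of primaries plus one replica
usable_tb = usable_capacity_tb(nodes=3, disk_tb_per_node=2.5)
print(round(usable_tb, 2), usable_tb >= needed_tb)
```

Three nodes with 2.5 TB each give roughly 6.4 TB below the watermark, so the estimate fits, but with little headroom for growth.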

But I totally agree with @leandrojmp: a POC is really important, because the estimates above do not take into account the query side (dashboards, alerts, ML jobs, etc.). I still think you will probably be fine if you have a normal use case, BUT test first, and deploy once or twice only.

Zay_Lin_Htun · November 6, 2022, 4:14pm

I get your point, and thank you for your suggestions.
I have one more question about managing this cluster (3 master nodes and data nodes): how can I configure it with only public IPs, and how do I receive the logs from Logstash?

In this cluster, which nodes are suitable for running Kibana, or does Kibana need a dedicated instance?

system · December 4, 2022, 4:14pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
