What should my ELK setup be if I want to keep the logs for 6 months?
My flow is syslog-ng -> Kafka -> Logstash -> Elasticsearch -> Kibana.
My data volume is 120GB per day.
I want to know how many Elasticsearch/Kibana/Logstash nodes I should use, and how much disk/CPU/RAM each of those nodes needs.
There is no default configuration for the number of nodes.
To estimate the space you need, you must decide how you want to keep your data: a Hot-Warm-Cold architecture or Hot only.
Elastic recommends 1GB of RAM per 30GB of data in the Hot and Warm tiers, and 1GB of RAM per 100GB of data in the Cold tier.
May I suggest: 1 Logstash node; 1 dedicated master node; (Hot) 2 master-data nodes + 1 data node; (Warm) 3 data nodes; (Cold) 2 data nodes. If you only want to use the Hot tier, it could be 1 master node + 2 master-data nodes + 3 data nodes. That would be a good architecture.
The three data nodes in the Hot and Warm tiers are for high availability of the data.
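In case it helps, here is a minimal sketch of how those roles could be declared in elasticsearch.yml for a hot/warm/cold layout (this assumes the node.roles syntax available from 7.10 onwards; the tier assignments are only an illustration of the suggestion above):

```yaml
# Dedicated master node
node.roles: ["master"]

# Hot tier: master-eligible, holds the newest indices and takes the ingest load
node.roles: ["master", "data_hot", "data_content"]

# Warm tier: data only
node.roles: ["data_warm"]

# Cold tier: data only
node.roles: ["data_cold"]
```

Each block above goes in a separate node's elasticsearch.yml; a node takes exactly one of these role lists.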
Here are some links of interest:
Current ELK setup for 3-month log retention, using only the hot phase of the ILM policy:
1 Kibana node
1 Logstash node
3 Kafka/syslog-ng nodes
3 Elasticsearch master nodes
2 Elasticsearch coordinating nodes
6 Elasticsearch data nodes
Now I want to keep logs for 6 months at 120GB of data per day, with the 6-7 data sources (indices) I have.
You can set 2 of the 3 master nodes as master-data too, to improve ingest, retention and access to all the data.
For the data machines, how many GB of RAM and disk do you have? Have you configured jvm.options to change the amount of RAM assigned to the JVM heap?
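For reference, the heap is normally set with a matching -Xms/-Xmx pair in jvm.options (or a file under jvm.options.d/). A minimal sketch, assuming a 62GiB data node and the usual guidance of keeping the heap at no more than ~50% of RAM and below ~31GB so compressed object pointers stay enabled:

```
# jvm.options.d/heap.options -- example values, adjust to your hardware
-Xms30g
-Xmx30g
```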
Actually, this article from @Christian_Dahlqvist is still one of the best.
It is written for Elastic Cloud, but the concepts and calculations are valid.
There are specific examples of calculations...
Also, if you are building an on-prem cluster with a Basic license, you can use the same Hot/Warm profiles.
The only difference is that we now recommend a 160:1 disk-to-RAM ratio on warm nodes, so
64GB RAM ≈ 10TB of disk (64 × 160 = 10,240GB).
| node | disk (df -h) | version | CPU | RAM (GiB) |
|---|---|---|---|---|
| kibana/Nginx | 100G | 7.16.2 | 8 | 62 |
| syslog-ng/kafka | 100G | 2.11-2.2 | 4 | 15 |
| syslog-ng/kafka | 100G | 2.11-2.2 | 4 | 15 |
| syslog-ng/kafka | 100G | 2.11-2.2 | 4 | 15 |
| logstash | 100G | 7.16.2 | 8 | 15 |
| em1 (master) | 50G | 7.16.2 | 4 | 15 |
| em2 (master) | 50G | 7.16.2 | 4 | 15 |
| em3 (master) | 50G | 7.16.2 | 8 | 62 |
| ec1 (coordinating) | 50G | 7.16.2 | 4 | 15 |
| ec2 (coordinating) | 50G | 7.16.2 | 4 | 15 |
| ed4 (data) | 1000G | 7.16.2 | 8 | 62 |
| ed5 (data) | 1000G | 7.16.2 | 8 | 62 |
| ed6 (data) | 1000G | 7.16.2 | 8 | 62 |
| ed1 (data) | 1000G | 7.16.2 | 8 | 62 |
| ed2 (data) | 1000G | 7.16.2 | 8 | 62 |
| ed3 (data) | 1000G | 7.16.2 | 8 | 62 |
Okay, I will look into it. Thank you.
How would this go for 120GB of data per day with 6 months of retention, using only the hot phase?
total nodes = 16
data nodes = 7
coordinating nodes = 2
master nodes = 3
logstash = 2
kibana = 2 (one on standby)
6 months of log retention: 120GB per day × 180 days = 21,600GB (≈21.6TB) in 6 months
3 master nodes = 100GB each
2 coordinating nodes = 100GB each
7 data nodes = 3TB each
3 kafka/syslog-ng nodes = 3TB each, with a 1-week retention policy
2 logstash nodes = 1TB each
2 kibana nodes = 100GB each
This is defined for all 6 data nodes in my current setup:
node.roles: ["data_hot","data_content"]
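Since this plan keeps everything in the hot phase, the 6-month retention itself would come from an ILM policy that rolls indices over and deletes them after 180 days. A minimal sketch (the policy name and the rollover thresholds are only example values to adapt to your indices):

```
PUT _ilm/policy/logs-180d
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "7d"
          }
        }
      },
      "delete": {
        "min_age": "180d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Note that min_age in the delete phase is counted from rollover, so the oldest events in each index actually live a little longer than 180 days.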
You could increase the size of the data nodes a little, maybe another 500GB each to leave some headroom; the rest I think is fine.
If you see that Kibana, Logstash or any of the other services is slow, you can increase RAM or CPU, depending on which of the two is saturated.
Given that you are setting up 2 or 3 of everything it looks like you are looking for some level of redundancy and high availability. If this is the case you probably want to have a replica shard for each primary shard, which will double the size of indices on disk. I would therefore double the amount of disk on each data node.
This 120GB is primary + replica; I collected it from Index Management. I was planning this:
total nodes = 16
data nodes = 7
coordinating nodes = 2
master nodes = 3
logstash = 2
kibana = 2 (one on standby)
6 months of log retention: 120GB per day × 180 days = 21,600GB in 6 months
3 master nodes = 100GB each
2 coordinating nodes = 100GB each
7 data nodes = 3TB each
3 kafka/syslog-ng nodes = 3TB each, with a 1-week retention policy
2 logstash nodes = 1TB each
2 kibana nodes = 100GB each
I just wanted to know the RAM and CPU.
As @Christian_Dahlqvist said, if you want to keep primary + replica you must double the space: if the 120GB per day were primary data only, the replica would take it to 240GB per day.
To calculate the RAM, use this formula: 1GB of RAM per 160GB of data.
The CPU, at first glance, depends on how many people connect simultaneously and on all the processes you use: transforms, ILM, machine learning, etc. To judge this you can check whether the current CPU configuration works well, or run the "top" or "htop" command on the Linux hosts; if you see that the load is very high, you should add CPU.
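Alongside top/htop on the hosts, a quick way to see CPU, heap and disk pressure per Elasticsearch node is the cat nodes API (shown here as a Kibana Dev Tools request; the column list is just one possible selection):

```
GET _cat/nodes?v&h=name,node.role,cpu,load_1m,heap.percent,ram.percent,disk.used_percent
```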
But the 120GB is 60GB + 60GB, primary + replica.
Good to hear you have already accounted for that as it can make a major difference if missed.
Dedicated master nodes should not serve requests, so they usually do not need a lot of CPU and RAM; 2 CPU cores and 4GB RAM (2GB heap) might be a good starting point.
For data nodes, the size will depend on the load they will be under. I would recommend a ratio of 1 CPU core per 8GB of RAM as a good starting point. 32GB to 48GB of RAM per data node might be a good starting point based on the disk size you specified.
Dedicated coordinating nodes are more difficult to size, as it depends on how much query load they serve and whether they also handle ingest pipeline processing.
I do not have any recommendations around Logstash and Kibana nodes.
Okay, thank you.
Is the disk space also okay?
3 master nodes = 100GB each
2 coordinating nodes = 100GB each
7 data nodes = 3TB each
3 kafka/syslog-ng nodes = 3TB each, with a 1-week retention policy
2 logstash nodes = 1TB each
2 kibana nodes = 100GB each
I didn't get this part: "To calculate the RAM, use this formula: 1GB of RAM per 120GB of data"?
How much data a node can handle will depend on the data, how you optimise indices and the load on the cluster. 120GB of data on disk per 1GB of RAM sounds a bit aggressive for a node that holds a lot of data and also handles indexing, so I was a bit more conservative.