Is it possible to archive and analyze 1 year of Metricbeat data in Kibana?


(saravanan) #1

Hi,

Can anyone help me? How do I archive and analyze Metricbeat data in Kibana?

How should I set up the infrastructure as well?


(Mario Castro) #2

Hi @sarava :slight_smile:

You can follow the Elasticsearch getting started guide to understand how to set up the infrastructure. As long as you have space available on disk, you could store 1 year of data, or 10.
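As a minimal local try-out, the setup can look like the sketch below. The version numbers and paths are illustrative only, and the `elasticsearch.url` setting assumes a 6.x-era Kibana install with default ports:

```shell
# Start Elasticsearch as a daemon (paths/versions are placeholders for
# whatever you downloaded from the getting started guide).
./elasticsearch-6.3.0/bin/elasticsearch -d

# Verify it is up; this should return the cluster name and version as JSON.
curl localhost:9200

# Point Kibana at the cluster in kibana.yml, then start it:
#   elasticsearch.url: "http://localhost:9200"
./kibana-6.3.0-linux-x86_64/bin/kibana
```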

Once you have Elasticsearch running, you just need to connect Kibana to Elasticsearch to analyze your data.

For your reference, this is the Metricbeat getting started guide.

Best regards!


(saravanan) #3

We are going to build a new setup to analyse one year of data from Linux and Windows hosts. Can I get the recommended CPU, memory and disk space details for the servers?

If you can give figures for both single-node and 3-node setups, that would help a lot.

Metricbeat count - 1500
Winlogbeat count - 1500


(Christian Dahlqvist) #4

Both Beats can generate varying amounts of data depending on how they are configured and how much the host they are deployed on logs, so it is hard for anyone without more detailed knowledge of your deployment to estimate how much data they would generate. I recommend setting them up on a few hosts and collecting data over a few days to see how much they generate. That will give you a better idea of how much data the cluster will need to hold, and will make it easier to get help estimating the required number of nodes and hardware specification.


(saravanan) #5

I have already mentioned the host count of 1500. We have installed the Beats on a few hosts and are receiving logs as well.

175 servers - receiving 50GB of data.

May I know the resource requirements for Elasticsearch and Kibana to analyze the logs from 1500 servers?


(Christian Dahlqvist) #6

Over what time period did the 175 hosts generate this 50GB of data? How many indices and shards were this indexed into? Was this on a single node or a cluster with replicas configured? Did this include representative data from both types of Beats?


(saravanan) #7

One day of Metricbeat data is 50GB.
It was indexed into only one index and one shard.
It is a 3-node cluster with replicas configured.


(Christian Dahlqvist) #8

How much data does Winlogbeat generate for the same period?


(saravanan) #9

Hi Christian,

Winlogbeat - 8GB from 25 servers


(Christian Dahlqvist) #10

OK. Based on that it seems like 1 host generates 50GB/175 + 8GB/25 = 605MB/day (this includes one replica shard). This is split almost evenly between Metricbeat and Winlogbeat data. If we simply extrapolate from this, we get 1500 hosts generating 909GB/day. Over a year that is close to 324TB of data stored.
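A quick back-of-the-envelope check of that arithmetic, using only the figures from earlier in the thread (the GB-to-TB conversion uses 1024):

```shell
# Extrapolate the measured per-host daily volume to 1500 hosts for a year.
awk 'BEGIN {
  per_host = 50/175 + 8/25        # GB/host/day, incl. one replica (~0.61 GB)
  daily    = per_host * 1500      # GB/day for all 1500 hosts
  yearly   = daily * 365 / 1024   # TB stored over one year
  printf "per host: %.2f GB/day\n", per_host
  printf "daily:    %.0f GB/day\n", daily
  printf "yearly:   %.0f TB/year\n", yearly
}'
```

This prints roughly 0.61 GB/day per host, 909 GB/day in total, and 324 TB over a year, matching the numbers above.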

If you have that amount of data coming in and want to keep it online for a full year, that will require a quite substantial cluster.

For Metricbeat data it may be possible to utilise rollups and reduce the volume significantly, but that feature is not yet GA.


(saravanan) #11

Thanks for your update. I will discuss with my management and will get back to you.


(Christian Dahlqvist) #12

If you do not need the full year online, and it is acceptable to restore data before querying older data, you can look into using snapshot and restore.
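As a sketch of what that workflow can look like with the snapshot APIs, assuming defaults on localhost:9200. The repository name, filesystem path, and index patterns are placeholders, and the path must also be listed under `path.repo` in `elasticsearch.yml` on every node:

```shell
# Register a shared-filesystem snapshot repository (location is a placeholder).
curl -X PUT "localhost:9200/_snapshot/archive" \
  -H 'Content-Type: application/json' \
  -d '{ "type": "fs", "settings": { "location": "/mnt/backups/archive" } }'

# Snapshot older time-based indices before deleting them from the cluster.
curl -X PUT "localhost:9200/_snapshot/archive/metricbeat-2018.06" \
  -H 'Content-Type: application/json' \
  -d '{ "indices": "metricbeat-*-2018.06.*" }'

# Later, restore the snapshot on demand before querying the old data.
curl -X POST "localhost:9200/_snapshot/archive/metricbeat-2018.06/_restore"
```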


(saravanan) #13

I already tried snapshot and restore earlier. Once the data restoration is done, how do I get the index status from red back to green?

Is there any command to refresh the cluster indices, or how should it be done?


(Christian Dahlqvist) #14

Did you monitor the restore progress to make sure it had completed? Once done, it will go from red to yellow or green, depending on whether any replicas need to be restored or not.
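A few ways to monitor that, assuming defaults on localhost:9200 (the repository and snapshot names here are placeholders for whatever you used):

```shell
# Snapshot/restore status for a specific snapshot.
curl "localhost:9200/_snapshot/archive/metricbeat-2018.06/_status"

# Per-shard recovery progress, including shards being restored from a snapshot.
curl "localhost:9200/_cat/recovery?v"

# Health per index: red while primaries are restoring, then yellow or green.
curl "localhost:9200/_cluster/health?level=indices"
```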


(saravanan) #15

It was completed. I will do it once again and then let you know the status.