We are working on an ELK Stack for log analysis and reporting.
Around 20 GB (raw data) of logs per day from different sources need to be stored in Elasticsearch. With a 3-year retention period, the total log size would be 20-25 TB (raw data).
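As a quick sanity check on the 20-25 TB figure, here is a back-of-the-envelope sizing sketch. The replica factor is an assumption (Elasticsearch defaults to one replica per primary shard, which roughly doubles on-disk usage); indexing overhead and compression are not included.

```python
GB_PER_DAY = 20        # raw log volume per day, from the requirement above
RETENTION_YEARS = 3    # retention period

# Total raw data over the retention window
raw_total_gb = GB_PER_DAY * 365 * RETENTION_YEARS
raw_total_tb = raw_total_gb / 1000  # ~21.9 TB, consistent with the 20-25 TB estimate

# Assumption: one replica per primary (the Elasticsearch default),
# which roughly doubles the on-disk footprint
with_replica_tb = raw_total_tb * 2

print(f"raw: {raw_total_tb:.1f} TB, with one replica: {with_replica_tb:.1f} TB")
```

So the raw 20-25 TB estimate holds, but the cluster's disk budget should also account for replicas and indexing overhead.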
Following are the points on which we need your team's inputs:
- What would be the ideal cluster configuration (number of nodes, CPU, RAM, disk size per node, etc.) for storing the above-mentioned volume of data in Elasticsearch?
- Is any tuning required in Kibana to search and visualize this volume of data?
- What is the maximum Elasticsearch retention period currently used in the industry?
- Is there any way to compress the data before storing it in Elasticsearch?
- Is there a migration tool available to load the data from the existing system into Elasticsearch?
- Does any buffering or backup system (Redis, HDFS, etc.) need to be set up before loading the data into Elasticsearch?
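For background on the compression question: Elasticsearch compresses stored data by default (LZ4), and an index can opt into heavier DEFLATE compression via the `index.codec` setting, trading some indexing/query speed for disk savings. A minimal sketch of the create-index request (index name `logs-raw` is a placeholder), as you would run it in Kibana Dev Tools:

```json
PUT logs-raw
{
  "settings": {
    "index.codec": "best_compression"
  }
}
```

This only affects newly written segments, so it is typically set in an index template before data is loaded.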