Hi everyone, this is my first post here. I am planning to deploy a cluster that will ingest around 2.5 TB of log data per day. The logs will be parsed locally on a few Logstash instances and then sent to AWS, which will host the Elasticsearch instances. So far I have come up with the configuration below:
It's basically 3x master nodes and 4x data nodes on c4 and c2 xlarge instances, with 19 times 16.3 TB of SSD. We will be using S3 since the log retention period is 3 months. Our plan is to keep one month of logs on SSD for evaluation and 2 months of raw logs, which comes to approx. 80 TB, in S3 buckets, so that we can index them as and when needed using Lambda. We will be using Curator to delete older data.
I need to understand whether the above hardware configuration would be fine. If not, what would the CPU and memory requirements be?
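
For reference, here is a rough sketch of the storage arithmetic behind the retention plan. It assumes 30-day months and decimal terabytes. The `expansion_ratio` and `replicas` parameters are hypothetical placeholders (not measured values) for how much the indexed data grows or shrinks relative to the raw logs and how many replica copies are kept. Note that two months of uncompressed raw logs at 2.5 TB/day works out to 150 TB, so the approx. 80 TB figure would only hold if the archived logs are compressed to a bit over half their size; that ratio is an inference from the numbers, not something confirmed.

```python
# Rough capacity estimate for the hot (SSD) tier and the S3 archive.
# All ratios below are assumptions for illustration, not benchmarks.

DAILY_INGEST_TB = 2.5      # from the post: ~2.5 TB of logs per day
HOT_DAYS = 30              # one month kept indexed on SSD (30-day month assumed)
ARCHIVE_DAYS = 60          # two months of raw logs kept in S3
STATED_ARCHIVE_TB = 80.0   # approximate archive size quoted in the post


def raw_volume_tb(daily_ingest_tb: float, days: int) -> float:
    """Raw log volume accumulated over the given number of days."""
    return daily_ingest_tb * days


def indexed_volume_tb(raw_tb: float, expansion_ratio: float = 1.0,
                      replicas: int = 1) -> float:
    """Disk needed once indexed: raw size scaled by a hypothetical
    expansion ratio, stored once per primary plus once per replica."""
    return raw_tb * expansion_ratio * (1 + replicas)


hot_raw = raw_volume_tb(DAILY_INGEST_TB, HOT_DAYS)
archive_raw = raw_volume_tb(DAILY_INGEST_TB, ARCHIVE_DAYS)
hot_indexed = indexed_volume_tb(hot_raw, expansion_ratio=1.0, replicas=1)
implied_compression = STATED_ARCHIVE_TB / archive_raw

print(f"Raw logs for the hot month:       {hot_raw:.1f} TB")
print(f"Indexed hot tier (1 replica):     {hot_indexed:.1f} TB")
print(f"Raw logs for the archive months:  {archive_raw:.1f} TB")
print(f"Compression implied by ~80 TB:    {implied_compression:.2f}")
```

Changing `expansion_ratio` and `replicas` to match a real test index would give a more realistic SSD figure to compare against the planned disk capacity.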