I'm at the point of moving a POC into a production environment. The architecture needs to process and index upwards of 5,000 events per second and support reasonably efficient queries over up to 5 TB of data, with events going back as far as a year. I'm ideally limited to four ES servers, which will also host Logstash instances. My plan is to add a fifth node, an external VM acting as a client node, for visualizing the data via Kibana.
I know that in a properly built Elastic Stack, storage is the bottleneck. It's also the hardest thing to right-size after the fact. Does anyone have storage recommendations for my use case? I'll likely want 5 TB per node and would consider SSDs if they're worth the cost. I'm currently using two 15k 2 TB HDDs per node, which are likely what's limiting my indexing and query performance, the two metrics I'd most like to improve when migrating to a production deployment.
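To make the storage numbers concrete, here is a rough back-of-the-envelope sketch in Python. It only uses the figures above (5,000 events per second, 5 TB, one year, four nodes). The replica count and the 15% free-space headroom are assumptions for illustration, not measured values, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def implied_bytes_per_event(total_bytes: float, events_per_second: float,
                            retention_days: int = 365) -> float:
    """Average on-disk bytes per event if the rate were sustained all year."""
    total_events = events_per_second * SECONDS_PER_DAY * retention_days
    return total_bytes / total_events


def per_node_disk_tb(primary_tb: float, replicas: int, nodes: int,
                     headroom: float = 0.15) -> float:
    """Disk needed per node for primaries plus replicas, keeping some space free.

    `replicas` and `headroom` are assumptions; adjust them to the real cluster.
    """
    total_tb = primary_tb * (1 + replicas)
    return total_tb / nodes / (1 - headroom)


if __name__ == "__main__":
    # 5 TB taken as 5e12 bytes, 5,000 events/s sustained for 365 days.
    print(f"bytes per event: {implied_bytes_per_event(5e12, 5000):.1f}")
    # 5 TB of primary data, one replica (assumed), spread over four data nodes.
    print(f"disk per node (TB): {per_node_disk_tb(5, replicas=1, nodes=4):.2f}")
```

If 5,000 events per second were sustained for a full year, 5 TB would leave only about 32 bytes per event, so either that rate is a peak rather than an average or the 5 TB figure is on the low side. With one replica assumed, 5 TB of primary data works out to roughly 3 TB of disk per node across four nodes.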
I've also read that 64 GB of RAM is the sweet spot, along with 4 to 8 cores. I plan to go with 64 GB and 16 cores per node, since I'll be running Logstash and Elasticsearch on the same nodes. If there are any other recommendations for hardware allocation here, I'm all ears. Any other advice is welcome too. Thanks!
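For the memory side, a minimal sketch of how 64 GB might be split on a node that runs both Logstash and Elasticsearch. It follows the commonly cited guidance of giving Elasticsearch at most half of the available RAM as heap and keeping the heap below roughly 32 GB so the JVM can use compressed object pointers. The 4 GB Logstash heap and the 31 GB cap are assumptions, and `memory_budget_gb` is a hypothetical helper.

```python
def memory_budget_gb(total_ram_gb: float,
                     logstash_heap_gb: float = 4.0,
                     heap_cap_gb: float = 31.0) -> dict:
    """Split a node's RAM between Logstash, the ES heap, and the OS file cache.

    logstash_heap_gb and heap_cap_gb are assumed values, not recommendations
    taken from the post.
    """
    available = total_ram_gb - logstash_heap_gb
    # ES heap: at most half of what is left, and never above the cap.
    es_heap = min(available / 2, heap_cap_gb)
    # Everything else is left to the OS file system cache, which Lucene relies on.
    fs_cache = available - es_heap
    return {
        "logstash_heap_gb": logstash_heap_gb,
        "es_heap_gb": es_heap,
        "fs_cache_gb": fs_cache,
    }


if __name__ == "__main__":
    print(memory_budget_gb(64))
```

Under these assumptions a 64 GB node ends up with about 30 GB of Elasticsearch heap and about 30 GB left for the file system cache.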
edit: this will all be on the 5.0 Elastic Stack