I'm currently running an Elasticsearch cluster with 6 nodes, and on every node the disk usage is around 5.5 TB out of a total disk size of 20 TB. I don't anticipate a significant increase in data storage, and I'm wondering whether it would be better for performance to reduce each node's disk to 10 TB instead of 20 TB.
I would also like to know the recommended disk usage per Elasticsearch node, as well as any best practices for optimizing performance based on disk size and other factors.
Any advice or insights would be greatly appreciated. Thank you in advance for your help!
ILM (index lifecycle management) is going to be your friend here: it helps make sure your shards don't get too big, which can degrade performance. For example, you can set up ILM with a rollover policy so your indices don't grow too large, rolling over automatically based on age or size.
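As a minimal sketch, a rollover policy like the one below caps primary shard size and index age in the hot phase. The policy name `my-rollover-policy` and the thresholds (50 GB is the commonly cited upper bound for shard size, 30 days is just an example) are assumptions you'd tune for your own cluster:

```
PUT _ilm/policy/my-rollover-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          }
        }
      }
    }
  }
}
```

Attach the policy to an index template (via `index.lifecycle.name` and `index.lifecycle.rollover_alias`, or a data stream) and ILM will create a new backing index whenever either threshold is hit.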