I'm currently running an Elasticsearch cluster with 6 nodes, and (for every node) the disk usage is around 5.5 TB out of a total disk size of 20 TB. I don't anticipate a significant increase in data storage, and I'm wondering whether it would be better to have a proportional disk size of 10 TB instead of 20 TB for performance optimization.
I would also like to know the recommended disk usage per Elasticsearch node, as well as any best practices for optimizing performance based on disk size and other factors.
Any advice or insights would be greatly appreciated. Thank you in advance for your help!
This really depends on your data and retention needs, but here are some thoughts:
Generally, the size of the disk itself won't affect performance. What matters is spinning disk vs. SSD, with SSDs being faster and preferred.
You want to keep your nodes well below the disk-based allocation watermarks, and at 5.5 TB used out of 20 TB you have plenty of headroom there.
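As a quick sanity check, here's a minimal sketch of the headroom math, assuming Elasticsearch's default watermarks (low 85%, high 90%, flood stage 95%) and the 5.5 TB / 20 TB figures from the question:

```python
# Headroom check against Elasticsearch's default disk-based
# shard-allocation watermarks (low 85%, high 90%, flood stage 95%).
# Figures below are the 5.5 TB used / 20 TB total from the question.
disk_total_tb = 20.0
disk_used_tb = 5.5

used_pct = disk_used_tb / disk_total_tb * 100
# Headroom before the low watermark (85%), where Elasticsearch stops
# allocating new shards to the node.
headroom_to_low_tb = disk_total_tb * 0.85 - disk_used_tb

print(f"used: {used_pct:.1f}%")                                   # 27.5%
print(f"headroom to low watermark: {headroom_to_low_tb:.1f} TB")  # 11.5 TB
```

Even if you halved the disks to 10 TB, 5.5 TB used would still be at 55%, comfortably under the default 85% low watermark.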
ILM (index lifecycle management) is going to be your friend for making sure your shards don't get too big, which can degrade performance. For example, you can set up ILM with a rollover index so your indices don't get too large and automatically roll over based on age or size.
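As a sketch, an ILM policy with a rollover action might look like this in Dev Tools (the policy name and thresholds here are placeholders; tune them for your data):

```
PUT _ilm/policy/my-rollover-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50gb",
            "max_age": "30d"
          }
        }
      }
    }
  }
}
```

Whichever condition is hit first (shard size or age) triggers the rollover to a new backing index.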
Finally, take a look at the best practices for shard sizing. There's also lots of good advice on that entire page as well.
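To see how your current shards measure up, the `_cat/shards` API lists them with their store size, sorted largest first:

```
GET _cat/shards?v&s=store:desc
```

That makes it easy to spot any outliers that have grown well past the commonly recommended size range.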
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.