I wonder if there is a good blog post or page that compares Elasticsearch performance on SSDs vs. HDDs.
I know Elasticsearch recommends SSDs for better performance, but a detailed comparison between SSDs and HDDs would be very helpful.
Here is my use case: I will be running Elasticsearch with an X-Pack Platinum license for anomaly detection.
Do SSDs mainly improve "searching" or "indexing" performance? Please advise.
I am not aware of any direct comparisons. SSDs generally provide far better disk I/O than even fast spinning disks, especially for random access loads. If you run a benchmark where you push Elasticsearch to the limit with a mixed indexing and query load, a cluster backed by SSDs will in my experience perform much better. It will depend a lot on the use case though.
Many use cases, however, do not operate Elasticsearch at peak performance. For a lot of large-scale use cases, especially where data has a long retention period, running clusters on spinning disks is perfectly fine. I would therefore recommend that you run some realistic benchmarks to see which option fits your use case best, or provide some additional details about your use case to get more accurate estimates.
I am assuming that you want to use a significant portion of your storage. Spinning disks typically have a lot of storage but low I/O performance, which means that they will only support a relatively low indexing/query rate compared to SSDs. If you have a short retention period and want to use all the storage available, you may need to index more data every day than the spinning disks can support. This is why I recommend spinning disks for use cases with longer retention periods, where a lot of the data sits idle on disk for long periods.
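To make the retention argument a bit more concrete, here is a rough back-of-the-envelope sketch in Python. The function name and the 10 TB per node figure are purely hypothetical, and the calculation ignores replicas, segment merges and query load, so treat the result as a lower bound on the write throughput the disks would need to sustain, not as a sizing rule.

```python
SECONDS_PER_DAY = 86_400


def required_ingest_mb_per_s(usable_tb: float, retention_days: float) -> float:
    """Average on-disk write rate (MB/s) needed to fill `usable_tb`
    of storage exactly once within `retention_days`.

    Hypothetical helper: ignores replicas, merges and query I/O.
    """
    total_mb = usable_tb * 1_000_000  # decimal TB -> MB
    return total_mb / (retention_days * SECONDS_PER_DAY)


if __name__ == "__main__":
    # Assumed example: 10 TB of usable storage per node.
    for days in (7, 30, 365):
        rate = required_ingest_mb_per_s(usable_tb=10, retention_days=days)
        print(f"{days:>3} days retention: {rate:.2f} MB/s sustained")
```

The shorter the retention period, the higher the sustained rate needed to keep the same disks full, which is where spinning disks tend to run out of I/O before they run out of space.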