I'm going to install a new Elasticsearch 6.x cluster with hot and warm nodes.
- hot nodes have SSD disks
- warm nodes have HDD disks
Indices with no modifications for N hours/days will be moved from hot to warm nodes (the "last update" date is computed by a custom script from multiple Elasticsearch statistics).
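To make the hot-to-warm move concrete, here is a minimal sketch of what the custom script could do, assuming the nodes are tagged with a custom attribute (`node.attr.box_type: hot` or `warm` in `elasticsearch.yml`) and indices are relocated through shard allocation filtering. The attribute name `box_type`, the index names, the 7-day threshold and the helper functions are all illustrative assumptions, and the sketch only builds the requests without sending them.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "box_type" is the custom node attribute that distinguishes
# hot and warm nodes. The name is a convention, not fixed by Elasticsearch.
ATTR = "box_type"


def stale_indices(last_update, max_age, now):
    """Return the sorted names of indices not updated for at least max_age."""
    return sorted(name for name, ts in last_update.items() if now - ts >= max_age)


def relocation_request(index, tier="warm"):
    """Build (method, path, body) for the index settings update that
    requires the index's shards to live on nodes of the given tier."""
    body = {"index.routing.allocation.require." + ATTR: tier}
    return "PUT", "/" + index + "/_settings", body


if __name__ == "__main__":
    now = datetime(2018, 6, 10, tzinfo=timezone.utc)
    # Hypothetical output of the custom "last update" script.
    last_update = {
        "logs-2018.06.01": datetime(2018, 6, 1, 12, tzinfo=timezone.utc),
        "logs-2018.06.09": datetime(2018, 6, 9, 23, tzinfo=timezone.utc),
    }
    for name in stale_indices(last_update, timedelta(days=7), now):
        print(relocation_request(name))
```

Once the setting is applied, Elasticsearch relocates the shards of that index to nodes whose `box_type` attribute matches.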
Multiple nodes will be installed on the same physical machine because the machines have a lot of RAM.
Each data disk will be:
- assigned to exactly one data node, to avoid concurrent access
- configured as RAID 0, to avoid the additional I/O caused by parity/mirroring; replica shards protect against data loss (primary and replica shards are allocated on disks located on different physical machines).
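As an illustration of the layout above, this is a rough sketch of the per-node settings I have in mind, written as a small generator. It assumes the custom attribute name `box_type`, hypothetical host names and mount points, and the `cluster.routing.allocation.same_shard.host` setting as one way to keep a primary and its replica off the same physical machine; none of this is a confirmed final configuration.

```python
def node_settings(host, index, tier, data_path):
    """Return the elasticsearch.yml settings for one data node as a dict."""
    return {
        "node.name": "{}-{}-{}".format(host, tier, index),
        # Assumed custom attribute used to tell hot and warm nodes apart.
        "node.attr.box_type": tier,
        # One dedicated disk (mount point) per node.
        "path.data": data_path,
        # Do not place two copies of the same shard on nodes that share a
        # host, since several nodes run on each physical machine.
        "cluster.routing.allocation.same_shard.host": True,
    }


def render(settings):
    """Render a flat settings dict as elasticsearch.yml lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append("{}: {}".format(key, value))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical machine with one SSD-backed and one HDD-backed node.
    print(render(node_settings("machine1", 1, "hot", "/data/ssd1")))
    print()
    print(render(node_settings("machine1", 1, "warm", "/data/hdd1")))
```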
The OS is CentOS 7.
I'm looking for recommendations about the choice of filesystem and block size for data partitions, but I haven't managed to find much. Here is what I have found so far:
- https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html#_disks but it only applies to Elasticsearch 1.x to 2.x.
- "Can I run multiple Elasticsearch nodes on the same machine?", which recommends dedicating disks to each node and configuring them with RAID 0.
So, my questions are:
- Do you have some recommendations about the choice of filesystem and block size for data partitions?
- Do these recommendations differ between hot and warm nodes, i.e. between SSD and HDD disks?