Questions about setting up an Elasticsearch server with 20 SSDs


#1

Hi, we are going to set up an Elasticsearch server with 20 SSDs, but I am not sure how we should group them.

I read some material that says to set them up as RAID 0. Should I group all 20 SSDs into one file system, or make each disk its own file system? Our sysadmin advises against grouping all the SSDs into one file system, because if one SSD dies it affects the whole file system.

Below are my questions if I set them up separately, as follows:
/data/0/
/data/1/
/data/2/
..
/data/20/

  1. How should I configure path.data in the Elasticsearch config?
  2. How do I make sure the shards (especially replicas) are distributed evenly across different disks instead of all landing on one disk?

#2

Hello;
Separate the list of directories with a comma, like
path.data: /data/0,/data/1,/data/3
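With 20 mount points, typing that line by hand gets error-prone. A minimal sketch that generates it, assuming the directories are numbered consecutively from 0 under /data (the function name `path_data_line` and its parameters are made up for illustration):

```python
def path_data_line(base="/data", count=3):
    """Build a path.data line listing base/0 .. base/(count-1).

    Assumption: the mount points are numbered consecutively from 0.
    """
    paths = [f"{base}/{i}" for i in range(count)]
    return "path.data: " + ",".join(paths)


# Example with three directories; pass count=20 for one path per SSD.
print(path_data_line(count=3))
```

The output can be pasted into the Elasticsearch config file as-is.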
There is obviously a risk of losing a whole file system after a single disk failure. However, splitting the storage up into multiple smaller file systems brings the danger of one or more of them filling up quite soon, possibly affecting multiple indices. I would probably rely on working replication and set up a single file system.
As I understand it, once a shard is written into a directory, everything belonging to that shard is stored there. You could achieve better IO distribution through striping, though.
Cheers,
wodenmo


#3

Thanks Wodenmo. My question about a single file system is: if one of the disks crashes, everything on that file system, including the replicas, will be gone, right?


(Nik Everett) #4

Elasticsearch won't allocate a replica to the same node as the primary ever, even if the node has multiple data paths.

RAID 0 with 20 disks is silly because the mean time between failures for the array becomes too low: losing any one disk loses the whole volume. Your sysadmin is right. The RAID 0 recommendation for Elasticsearch makes sense with 2 or 3 disks only. 4 is probably too many for comfort.
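To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes disks fail independently and that each has a 2% chance of failing in a year; both are illustrative assumptions, not measured figures for any particular SSD.

```python
def raid0_loss_probability(disks, annual_failure_rate):
    """Chance that at least one disk in a RAID 0 set fails within a year.

    RAID 0 has no redundancy, so one failed disk loses the whole volume.
    Assumes independent failures with the same per-disk rate.
    """
    return 1 - (1 - annual_failure_rate) ** disks


# 0.02 is a made-up per-disk annual failure rate, used only for comparison.
for n in (2, 3, 4, 20):
    print(n, "disks:", round(raid0_loss_probability(n, 0.02), 3))
```

Under those assumptions a 2-disk stripe has about a 4% chance of being lost in a year, while a 20-disk stripe has about a 33% chance.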

I'd go with either a single RAID 6 volume (most disk space without exciting disk losses), a single RAID 10 volume (most IO for a single volume), or 10 RAID 0 volumes (most disk space, but you still have exciting disk losses, which is probably not acceptable with 20 disks) and list them all in path.data. If you've spent the money to buy 20 SSDs, you really should spend the time trying out those layouts with the most realistic performance tests you can think of.
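The disk-space side of that trade-off is simple arithmetic. A small sketch, assuming 20 equal-sized disks and the standard overheads (RAID 6 spends two disks' worth of space on parity, RAID 10 mirrors every disk, RAID 0 spends nothing); the layout labels are made up for this example:

```python
def usable_disks(layout, disks=20):
    """Return how many disks' worth of capacity a layout leaves usable."""
    if layout == "raid6":
        return disks - 2      # two disks' worth of parity
    if layout == "raid10":
        return disks // 2     # every disk has a mirror
    if layout == "raid0_volumes":
        return disks          # no redundancy, all space is usable
    raise ValueError(f"unknown layout: {layout}")


for layout in ("raid6", "raid10", "raid0_volumes"):
    print(layout, usable_disks(layout))
```

This only covers capacity. IO behaviour still has to be measured with real tests on the actual hardware.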

