Using multiple paths in path.data distributes your shards across the paths (at the shard level, so each shard lives entirely on one path).
Using RAID 0 stripes your shards across the disks (at the block level).
In the first case, if you lose a disk, you lose the shards on that disk and nothing else.
In the second case, if you lose a disk, you lose all of your shards on that machine. Therefore the first gives you increased safety, but the trade-off is lower performance.
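To make the difference concrete, here is a minimal Python sketch of the two failure modes. It is only a toy model: the shard names, the disk names, and the round-robin placement are assumptions for illustration, not how Elasticsearch actually picks a data path.

```python
# Hypothetical node: four shards, two physical disks.
SHARDS = ["s0", "s1", "s2", "s3"]
DISKS = ["disk0", "disk1"]

def place_on_paths(shards, disks):
    """Multiple path.data entries: each shard sits whole on one disk.
    Round-robin placement is an assumption made for this sketch."""
    return {shard: {disks[i % len(disks)]} for i, shard in enumerate(shards)}

def place_on_raid0(shards, disks):
    """RAID 0: the blocks of every shard are spread over every disk."""
    return {shard: set(disks) for shard in shards}

def lost_shards(placement, failed_disk):
    """A shard is lost if any disk holding part of it has failed."""
    return sorted(s for s, used in placement.items() if failed_disk in used)

multi_path_loss = lost_shards(place_on_paths(SHARDS, DISKS), "disk0")
raid0_loss = lost_shards(place_on_raid0(SHARDS, DISKS), "disk0")

print(multi_path_loss)  # ['s0', 's2']
print(raid0_loss)       # ['s0', 's1', 's2', 's3']
```

With separate paths only the shards on the failed disk go away; with RAID 0 every shard on the machine depends on every disk, so one failure takes them all.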
As you said, in the second case, if you lose a disk, you lose all of your shards on that machine. But what if I have replicas on another machine with the same configuration?
Does Elasticsearch balance primary and replica shards in this case?
Let's say I have 2 nodes with a RAID 0 configuration and multiple paths set, as I said above:
node1-path.data: /mnt/md0, /mnt/md1
node2-path.data: /mnt/md2, /mnt/md3
As you said, if any disk on the first node fails, none of its shards will be usable any longer because of the RAID 0 configuration. I agree with this, but I have a second node, so does Elasticsearch balance the primary shards that failed on node1 over to node2 to keep the cluster fully functional?
You will have primary and replica shards balanced across all paths. If you lose node1-path.data: /mnt/md0, /mnt/md1, those paths would have contained both primary and replica shards, so the data on them is lost.
By default, Elasticsearch will not allocate a primary shard and its replica on the same node, so when node1 fails, the replica copies on node2 can be promoted to primaries.
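The allocation rule above can be sketched for the two-node setup from the question. This is a simplified Python model, not Elasticsearch's allocator: the function names, the shard count, and the alternating placement are all assumptions for illustration.

```python
def allocate(num_shards, nodes):
    """Toy allocator: a shard's primary and replica never share a node.
    Alternating placement is an assumption made for this sketch."""
    table = {}
    for shard in range(num_shards):
        primary = nodes[shard % len(nodes)]
        replica = nodes[(shard + 1) % len(nodes)]
        table[shard] = {"primary": primary, "replica": replica}
    return table

def fail_node(table, dead):
    """Drop every copy on the dead node; a surviving copy becomes primary.
    The replica slot stays empty because no second node is left to hold it."""
    after = {}
    for shard, copies in table.items():
        survivors = [node for node in copies.values() if node != dead]
        after[shard] = {
            "primary": survivors[0] if survivors else None,
            "replica": None,
        }
    return after

routing = allocate(4, ["node1", "node2"])
after_failure = fail_node(routing, "node1")

for shard, copies in after_failure.items():
    print(shard, copies)
```

In this model every shard still has a primary on node2 after node1 is lost, so no data is missing, but the replicas stay unassigned until another node is available to host them.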