RAID 0 vs multiple path.data in ES 2.4.0

Hi everyone.
I have a 3-data-node ES cluster hosted on AWS. All three nodes have identical config and hardware resources. Each box uses three general purpose (GP2) SSDs: 1 TB + 300 GB + 75 GB = 1.375 TB. The path.data setting lists folders mounted on the three different drives.

The Elasticsearch docs say that using a RAID 0 (striped) configuration will give increased throughput, even on SSDs.

My question is, would moving from a config where I specify multiple path.data to a RAID 0 config make sense?
Does being on AWS make a difference?

I am using ES 2.4.0. The workload is light on indexing but very heavy on search (thousands of QPS).

Thank you in advance!

Given the drives are of uneven size, moving to multiple paths won't be worth it.

Thanks for the reply, Mark.
If I change my disk arrangement to 4 x 400 GB = 1.6 TB, would that change things?
Can you explain why the uneven size of the drives makes a difference?

Yes, it will help.

ES spreads the shards on a node evenly across all paths, so you are restricted by the size of the smallest disk.
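To put rough numbers on that, here is a minimal sketch that takes the rule above at face value: if data is spread evenly over every disk, the smallest disk fills first, so usable space is about the number of disks times the smallest size. The function name `usable_capacity_gb` and the layout labels are made up for illustration. This is not Elasticsearch's actual allocation logic, only the arithmetic behind the even-spread assumption.

```python
def usable_capacity_gb(disk_sizes_gb):
    """Capacity usable when data is spread evenly over every disk.

    Assumption: each disk receives an equal share of the data, so the
    smallest disk fills first and caps the total at n * min(size).
    """
    if not disk_sizes_gb:
        return 0
    return len(disk_sizes_gb) * min(disk_sizes_gb)


# Disk layouts taken from the thread, sizes in GB.
layouts = {
    "current (1 TB + 300 GB + 75 GB)": [1000, 300, 75],
    "proposed (4 x 400 GB)": [400, 400, 400, 400],
}

for name, sizes in layouts.items():
    usable = usable_capacity_gb(sizes)
    total = sum(sizes)
    print(f"{name}: {usable} GB usable of {total} GB ({usable / total:.0%})")
```

Under that assumption the current layout can use only about 225 GB of its 1375 GB, while four equal 400 GB disks can use all 1600 GB.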

