RAID 0 vs multiple path.data in ES 2.4.0


(Avaneesh) #1

Hi everyone.
I have a 3-data-node ES cluster hosted in the cloud (AWS). All three nodes have identical config and hardware resources. Each box uses three general purpose (GP2) SSDs: 1 TB + 300 GB + 75 GB = 1.375 TB. The path.data setting lists folders mounted on the three different drives.

The Elasticsearch docs say that using a RAID 0 (striped) configuration will give increased throughput, even on SSDs.

My question is, would moving from a config where I specify multiple path.data to a RAID 0 config make sense?
Does being on AWS make a difference?

I am using ES 2.4.0. The indexing load is light, but the search load is very, very heavy (thousands of QPS).

Thank you in advance!


(Mark Walkom) #2

Given the drives are of uneven size, moving to multiple paths won't be worth it.


(Avaneesh) #3

Thanks for the reply Mark.
If I change my disk arrangement to 4 x 400 GB = 1.6 TB, would that change things?
Can you explain why the uneven size of the drives makes a difference?


(Mark Walkom) #4

Yes, it will help.

ES stores all shards on a node across all paths evenly; therefore, you are restricted to the smallest disk size.
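
To put rough numbers on that point, here is a minimal sketch of the arithmetic. It assumes the simplified model stated above: data is spread evenly across every volume, so each volume can contribute only as much as the smallest one. The function names are hypothetical, and actual allocation behavior depends on the ES version and on how the volumes are set up.

```python
def raw_capacity_gb(disk_sizes_gb):
    """Total capacity of all disks added together."""
    return sum(disk_sizes_gb)


def even_spread_capacity_gb(disk_sizes_gb):
    """Usable capacity under the ASSUMED even-spread model.

    If every disk receives the same amount of data, the smallest disk
    fills up first, so usable space is (number of disks) x (smallest disk).
    """
    if not disk_sizes_gb:
        return 0
    return len(disk_sizes_gb) * min(disk_sizes_gb)


# Disk layouts taken from the thread, in GB.
current = [1000, 300, 75]
proposed = [400, 400, 400, 400]

for label, disks in (("current", current), ("proposed", proposed)):
    raw = raw_capacity_gb(disks)
    usable = even_spread_capacity_gb(disks)
    print(f"{label}: raw={raw} GB, evenly usable={usable} GB ({usable / raw:.0%})")
```

Under this model the current layout can use only 225 GB of its 1375 GB evenly, while four 400 GB drives can use all 1600 GB.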

