RAID 0 vs multiple path.data in ES 2.4.0

avaneeshr · November 19, 2016, 9:01pm

Hi everyone.
I have a 3-data-node ES cluster hosted in the cloud (AWS). All three nodes have identical config and hardware resources. Each box has three general purpose (GP2) SSDs: 1 TB + 300 GB + 75 GB = 1.375 TB. The path.data setting lists folders mounted on the three different drives.

The Elasticsearch docs say that using a RAID 0 (striped) configuration will give increased throughput, even on SSDs.

My question is, would moving from a config where I specify multiple path.data to a RAID 0 config make sense?
Does being on AWS make a difference?

I'm using ES 2.4.0. The workload is light on indexing but very heavy on search (thousands of QPS).

Thank you in advance!

warkolm · November 20, 2016, 1:40am

Given the drives are of uneven size, moving to multiple paths won't be worth it.

avaneeshr · November 20, 2016, 6:51am

Thanks for the reply, Mark.
If I change my disk arrangement to 4 x 400 GB = 1.6 TB, would that change things?
Can you explain why the uneven size of the drives makes a difference?

warkolm · November 20, 2016, 7:29am

Yes, it will help.

ES stores all shards on a node across all paths evenly; therefore, you are restricted to the smallest disk size.
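
To put rough numbers on that, here is a minimal sketch of the arithmetic. It takes the "spread evenly" model above at face value and assumes every disk receives the same amount of data, so the smallest disk fills first and caps the total. The function name `usable_capacity_gb` is made up for illustration and is not part of Elasticsearch; real allocation behavior (and some software RAID implementations) may differ from this simplified model.

```python
def usable_capacity_gb(disk_sizes_gb):
    """Usable capacity if data is spread evenly over every disk.

    Assumption (simplified model, not Elasticsearch code): each disk
    receives the same amount of data, so the smallest disk fills
    first and limits the total to count * smallest.
    """
    if not disk_sizes_gb:
        raise ValueError("need at least one disk")
    return len(disk_sizes_gb) * min(disk_sizes_gb)


# Disk layouts taken from the thread, in GB (1 TB treated as 1000 GB).
current = [1000, 300, 75]         # 1 TB + 300 GB + 75 GB
proposed = [400, 400, 400, 400]   # 4 x 400 GB

for label, disks in (("current", current), ("proposed", proposed)):
    usable = usable_capacity_gb(disks)
    print(f"{label}: raw {sum(disks)} GB, usable {usable} GB")
```

Under this model the current layout is limited to about 225 GB of its 1375 GB, while four equal 400 GB disks can use the full 1600 GB.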
