RAID or JBOD

baytech · July 31, 2013, 7:27am

I currently don't have extreme throughput requirements and am planning a few
data nodes for indices. Since Elasticsearch already provides replicas,
which is recommended: RAID with SAS disks, or JBOD with SATA3 disks?
Backups will be stored separately offsite.

RAID with SAS would provide higher reliability and potentially higher read
performance, but lower overall capacity per node.
JBOD with SATA3 would give larger capacity per node and rely on Elasticsearch
replication and offsite backups.

If one disk goes bad, RAID would rebuild onto the replacement disk.
With JBOD, I'm not sure whether Elasticsearch would rebuild all the
shards/replicas on the entire node or just those on the failed disk. Basically,
how would Elasticsearch handle recovery if one of the disks goes bad?
Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Norberto_Meijome · July 31, 2013, 10:46am

I'd go with RAID 0 rather than JBOD, with an appropriate stripe size. If you lose
the RAID, you can rebuild it from other shards / backups.

If one disk goes bad, ES would recover only the shards that were lost (after it's
done panicking like hell because its underlying storage has gone AWOL ...)

--
Norberto 'Beto' Meijome

baytech · July 31, 2013, 7:41pm

Thanks for the feedback.

Any comments on using RAIDed 600GB SAS drives vs. 4TB SATA3 drives?
Besides the capacity difference, are there significant differences in
reliability and access speeds?

Thanks.

jprante · July 31, 2013, 7:56pm

Elasticsearch has no deeper knowledge of the disk subsystem installed.
Currently it relies on Java 6 JVM methods and these largely ignore how the
operating system deals with disk and drive resources.

Because of this, Elasticsearch is clueless about disk performance and disk
failures. It will continue as long as the OS allows access to the disks. There
is no automatic detection and repair built into ES.

If you encounter disk failures, you may be confronted with:

- high I/O load during an automatic RAID rebuild, which may be unwanted
  because a single failure can bog down the whole cluster
- read-only filesystems: you may notice massive write IOExceptions in the
  ES log or stalled ES JVMs
- sudden reboots, depending on the BIOS / OS fatal-error recovery strategy
- stalled / locked-up / unavailable machines if the damage is exceptionally
  severe
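
Since ES will not detect a read-only filesystem by itself, an external probe can help. Below is a minimal Python sketch, not anything built into ES, that checks whether a small file can still be written to a data directory. The function name is made up for illustration, and the system temp directory only stands in for a real ES data path.

```python
import os
import tempfile

def is_writable(directory):
    """Return True if a small file can be created, synced and removed in directory."""
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=directory) as probe:
            probe.write(b"probe")
            probe.flush()
            os.fsync(probe.fileno())  # push the write down to the device
        return True
    except OSError:
        # Covers read-only filesystems, missing directories and I/O errors.
        return False

# Stand-in path; replace with the node's actual data directory.
print(is_writable(tempfile.gettempdir()))
```

Run from cron or a monitoring agent, a probe like this could raise an alert before the write IOExceptions pile up in the ES log.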

If you need a resilient system with unattended recovery, put some spare
drives into your RAID 1 (or 10), trading some performance for fault tolerance.
RAID 5 is slow on writes; it is only interesting if you care about
maximum space rather than maximum performance.
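
To put numbers on the space trade-off, here is a rough Python sketch of usable capacity per RAID level for identical disks. The drive sizes (600GB SAS, 4TB SATA3) come from this thread; the count of four disks per node is only an assumption for illustration, and hot spares and filesystem overhead are ignored.

```python
def usable_capacity_gb(level, disks, disk_gb):
    """Return usable capacity in GB for an array of identical disks."""
    if level in ("jbod", "raid0"):
        return disks * disk_gb          # no redundancy: all space is usable
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_gb                  # every disk holds the same data
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb    # one disk's worth of parity
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disks // 2) * disk_gb   # striped mirror pairs
    raise ValueError("unknown level: " + level)

# Assumed: 4 disks per node.
for level, size in [("raid10", 600), ("raid5", 600), ("raid0", 600), ("jbod", 4000)]:
    print(level, size, "GB disks:", usable_capacity_gb(level, 4, size), "GB usable")
```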

If you are ready to accept that a disk failure takes down a whole machine
until it is fixed, you can go for RAID 0, which is supported in hardware for best
I/O performance (striped reads and writes in hardware). Increase the replica
level if you want to lower the risk from multiple node failures.
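
The replica level can be changed on a live index through the index settings API (`number_of_replicas`). Here is a small Python sketch that only builds the request, without sending it. The host `localhost:9200`, the index name `myindex`, and the helper name are placeholders, not values from this thread.

```python
import json
import urllib.request

def build_replica_request(base_url, index, replicas):
    """Build (but do not send) a PUT request that sets number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Placeholder host and index name.
req = build_replica_request("http://localhost:9200", "myindex", 2)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it against a running cluster: urllib.request.urlopen(req)
```

Keep in mind that each extra replica needs a full extra copy of the index on disk, so this interacts directly with the capacity question.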

A common usage pattern with JBOD is SW RAID with Linux mdadm and a single drive
in LVM. I do not prefer this configuration because, if a disk fails, you
may not be able to recover the SW RAID, no matter what the RAID level. It's
easy to lose superblocks while wrestling with file system repair tools.

If you do not use SW RAID, you could use ES path striping across (JBOD)
partitions, but there are some shortcomings compared to RAID. Besides
lacking fault tolerance, it does not guarantee that disk space and
disk I/O load are balanced evenly.
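
Path striping means listing several directories in the `path.data` setting. Because balance is not guaranteed, it may be worth watching how full each path gets. Below is a minimal Python sketch of such a check; the function names are made up, and the current directory only stands in for the real mount points.

```python
import shutil

def usage_report(paths):
    """Return {path: fraction of the underlying filesystem in use}."""
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        report[path] = usage.used / usage.total if usage.total else 0.0
    return report

def imbalance(report):
    """Spread between the fullest and the emptiest path (0.0 to 1.0)."""
    values = list(report.values())
    return max(values) - min(values)

# Stand-in path; in practice, list every directory configured in path.data.
report = usage_report(["."])
for path, used in report.items():
    print(path, "used:", round(used * 100, 1), "%")
print("imbalance:", round(imbalance(report) * 100, 1), "%")
```

This only covers disk space. I/O load per disk would have to be watched with OS tools such as iostat.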

Jörg

jprante · July 31, 2013, 7:58pm

Not an Elasticsearch question at all, so I have no answer. Maybe Tom's
Hardware can offer you information about these topics:
http://www.tomshardware.com/charts/hard-drives-and-ssds,3.html

Jörg
