As far as I can tell, ES distributes segments over all data paths. If you have reliable storage (e.g. a redundant RAID array), this is a good policy, but if you are using single disks then the failure of one disk can affect every shard on the node. I am fairly sure that ES can recover from such a failure, but in my case it means going from a few TB that need to be copied to tens of TB.
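For reference, my nodes have multiple data paths configured roughly like this in elasticsearch.yml (the mount points below are placeholders, not my actual layout):

    path.data: /mnt/disk1,/mnt/disk2,/mnt/disk3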
Does anyone have any practical experience of disk failure and recovery?
Are there any settings to force all segments in a shard to be created in
the same data path?
I guess I will need to restrict the number of disks per node and run more nodes instead.
If you are using single-disk machines, then all your segments will be created in the one data path (i.e. the default data directory).
On Linux with a package install, that's usually /var/lib/elasticsearch/.
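For a single-disk node that is effectively just the default setting, i.e. in elasticsearch.yml (the path shown is the usual deb/rpm package default; adjust for your install):

    path.data: /var/lib/elasticsearch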