I have 6 data nodes, and each node has 6 disks configured as a list in path.data.
One disk on one of the nodes failed, and the cluster went RED.
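For reference, the setup described above corresponds to an elasticsearch.yml along these lines (a sketch only; the /dataN/elastic_jpl pattern is taken from the log paths below, but the exact mount points are assumptions):

```yaml
# Hypothetical data-node configuration: six data paths, one per disk.
path.data:
  - /data1/elastic_jpl
  - /data2/elastic_jpl
  - /data3/elastic_jpl
  - /data4/elastic_jpl
  - /data5/elastic_jpl
  - /data6/elastic_jpl
```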
java.nio.file.FileSystemException: /data5/elastic_jpl/nodes/0/indices/IMwTL-ILTqKAC4Bsom7nzg/4/index/_9sbq_Lucene54_0.dvd: Read-only file system
At the time, the shard had no replicas (number_of_replicas was 0), so the data could not be recovered from another node.
Not knowing what else to do, I restarted that node. It shut down, but then it could not start again.
[2018-06-07T17:16:22,667][WARN ][o.e.i.e.Engine ] [xelastic202.band] [jpl_denorm_band_20180316][0] failed to rollback writer on close
java.nio.file.DirectoryIteratorException: java.nio.file.FileSystemException: /data5/elastic_jpl/nodes/0/indices/Zx8rpMFuSOSv0EQP3XzoTg/0/index: Input/output error
at sun.nio.fs.UnixDirectoryStream$UnixDirectoryIterator.readNextEntry(UnixDirectoryStream.java:172) ~[?:?]
at sun.nio.fs.UnixDirectoryStream$UnixDirectoryIterator.hasNext(UnixDirectoryStream.java:201) ~[?:?]
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:216) ~[lucene-core-6.4.1.jar:6.4.1 72f75b2503fa0aa4f0aff76d439874feb923bb0e - jpountz - 2017-02-01 14:43:32]
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:234) ~[lucene-core-6.4.1.jar:6.4.1 72f75b2503fa0aa4f0aff76d439874feb923bb0e - jpountz - 2017-02-01 14:43:32]
at org.apache.lucene.store.FilterDirectory.listAll(FilterDirectory.java:57) ~[lucene-core-6.4.1.jar:6.4.1 72f75b2503fa0aa4f0aff76d439874feb923bb0e - jpountz - 2017-02-01 14:43:32]
Questions:
- Is one shard's data (its segments) located on a single disk, or distributed across disks?
- If a shard's data is distributed across disks, do more disks increase the chance of shard failure (since its data could then be spread over more disks)?
- What can I do when a node with a failed disk will not start?
I am using Elasticsearch 5.2.
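To investigate question 1 on my own, I tried listing each data path to see which disk holds which shard directory. This is only a sketch: the index UUID is copied from the log line above, and the /dataN/elastic_jpl layout is my assumption based on those same log paths.

```shell
#!/bin/sh
# Sketch: for each data path (one per disk), list the shard directories it
# holds for one index UUID. If whole shard directories (0/, 1/, ...) appear
# under a single /dataN path, each shard lives entirely on one disk; if the
# same shard number showed up under several paths, its data would be spread
# across disks.
INDEX_UUID="IMwTL-ILTqKAC4Bsom7nzg"   # taken from the error log above
for p in /data1 /data2 /data3 /data4 /data5 /data6; do
  echo "== $p =="
  ls -d "$p/elastic_jpl/nodes/0/indices/$INDEX_UUID"/*/ 2>/dev/null || true
done
```

On a healthy node this prints, per disk, the numbered shard directories that disk contains (nothing is printed for paths that hold no shard of this index).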