I have 6 data nodes, and each node has 6 disks configured as a list in path.data.
One disk on one of the nodes failed, and the cluster went RED.
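For reference, the setup described above corresponds to an elasticsearch.yml along these lines (a sketch only; the /dataN/elastic_jpl pattern is taken from the log paths below, but the exact mount points are assumptions):

```yaml
# Hypothetical data-node configuration: six data paths, one per disk.
path.data:
  - /data1/elastic_jpl
  - /data2/elastic_jpl
  - /data3/elastic_jpl
  - /data4/elastic_jpl
  - /data5/elastic_jpl
  - /data6/elastic_jpl
```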
java.nio.file.FileSystemException: /data5/elastic_jpl/nodes/0/indices/IMwTL-ILTqKAC4Bsom7nzg/4/index/_9sbq_Lucene54_0.dvd: Read-only file system
At the time, the shard had no replicas (number_of_replicas was 0), so the data could not be recovered from another node.
Not knowing what else to do, I restarted that node. It shut down, but then it could not start again.
[2018-06-07T17:16:22,667][WARN ][o.e.i.e.Engine ] [xelastic202.band] [jpl_denorm_band_20180316][0] failed to rollback writer on close
java.nio.file.DirectoryIteratorException: java.nio.file.FileSystemException: /data5/elastic_jpl/nodes/0/indices/Zx8rpMFuSOSv0EQP3XzoTg/0/index: Input/output error
at sun.nio.fs.UnixDirectoryStream$UnixDirectoryIterator.readNextEntry(UnixDirectoryStream.java:172) ~[?:?]
at sun.nio.fs.UnixDirectoryStream$UnixDirectoryIterator.hasNext(UnixDirectoryStream.java:201) ~[?:?]
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:216) ~[lucene-core-6.4.1.jar:6.4.1 72f75b2503fa0aa4f0aff76d439874feb923bb0e - jpountz - 2017-02-01 14:43:32]
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:234) ~[lucene-core-6.4.1.jar:6.4.1 72f75b2503fa0aa4f0aff76d439874feb923bb0e - jpountz - 2017-02-01 14:43:32]
at org.apache.lucene.store.FilterDirectory.listAll(FilterDirectory.java:57) ~[lucene-core-6.4.1.jar:6.4.1 72f75b2503fa0aa4f0aff76d439874feb923bb0e - jpountz - 2017-02-01 14:43:32]
Questions:
- Is one shard's data (its segments) located on a single disk, or distributed across disks?
- If a shard's data is distributed across disks, do more disks increase the chance of shard failure (since its data could then be spread over more disks)?
- What can I do when a node with a failed disk will not start?
I am using Elasticsearch 5.2.
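To investigate question 1 on my own, I tried listing each data path to see which disk holds which shard directory. This is only a sketch: the index UUID is copied from the log line above, and the /dataN/elastic_jpl layout is my assumption based on those same log paths.

```shell
#!/bin/sh
# Sketch: for each data path (one per disk), list the shard directories it
# holds for one index UUID. If whole shard directories (0/, 1/, ...) appear
# under a single /dataN path, each shard lives entirely on one disk; if the
# same shard number showed up under several paths, its data would be spread
# across disks.
INDEX_UUID="IMwTL-ILTqKAC4Bsom7nzg"   # taken from the error log above
for p in /data1 /data2 /data3 /data4 /data5 /data6; do
  echo "== $p =="
  ls -d "$p/elastic_jpl/nodes/0/indices/$INDEX_UUID"/*/ 2>/dev/null || true
done
```

On a healthy node this prints, per disk, the numbered shard directories that disk contains (nothing is printed for paths that hold no shard of this index).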