Hi,
We have an old cluster running version 1.3.2 with 6 data nodes and 2 master nodes. At the moment, the cluster is red, with two shards stuck in an initialising state and two unassigned. However, I'm unable to determine which indices/shards have the problem, because the cat/indices and cat/shards APIs hang indefinitely. Can anyone advise why this might be happening and how I can tackle it?
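Is there another way to identify the problem shards? I was thinking of querying the health and cluster state APIs directly instead of the cat endpoints; a sketch of what I had in mind (host and port are placeholders for one of our nodes):

# Shard-level health: red shards show up per index and shard (1.x supports level=shards).
curl -s 'http://localhost:9200/_cluster/health?level=shards'

# Pull only the routing table from the cluster state, served from the local
# node's copy so it does not have to wait on the master.
curl -s 'http://localhost:9200/_cluster/state/routing_table?local=true'

For reference, here is what node06 is logging around the time the cluster went red: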
[2017-04-05 04:30:49,460][WARN ][index.merge.scheduler ] [node06] [index-2017.04.05][2] failed to merge
java.io.IOException: Input/output error: NIOFSIndexInput(path="/apps/elasticsearch/disk3/elasticsearch/nodes/0/indices/index-2017.04.05/2/index/_q8_es090_0.doc")
at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:186)
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
at org.apache.lucene.store.DataInput.readVInt(DataInput.java:126)
at org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:221)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.readVIntBlock(Lucene41PostingsReader.java:126)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.refillDocs(Lucene41PostingsReader.java:696)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextDoc(Lucene41PostingsReader.java:752)
at org.apache.lucene.codecs.MappingMultiDocsAndPositionsEnum.nextDoc(MappingMultiDocsAndPositionsEnum.java:104)
at org.apache.lucene.codecs.PostingsConsumer.merge(PostingsConsumer.java:109)
at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:164)
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:399)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:112)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4163)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3759)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:106)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
Caused by: java.io.IOException: Input/output error
at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:699)
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:684)
at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:176)
... 18 more
[2017-04-05 04:30:49,461][WARN ][index.engine.internal ] [node06] [index-2017.04.05][2] failed engine [merge exception]
[2017-04-05 04:30:49,970][WARN ][cluster.action.shard ] [node06] [index-2017.04.05][2] sending failed shard for [index-2017.04.05][2], node[K8i-aoFATluUWFunOB671w], [P], s[STARTED], indexUUID [5FNalKcQR2K5F9AeNOB0Ew], reason [engine failure, message [merge exception][MergeException[java.io.IOException: Input/output error: NIOFSIndexInput(path="/apps/elasticsearch/disk3/elasticsearch/nodes/0/indices/index-2017.04.05/2/index/_q8_es090_0.doc")]; nested: IOException[Input/output error: NIOFSIndexInput(path="/apps/elasticsearch/disk3/elasticsearch/nodes/0/indices/index-2017.04.05/2/index/_q8_es090_0.doc")]; nested: IOException[Input/output error]; ]]
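Also, once the underlying disk problem is fixed, would forcing allocation with the cluster reroute API be a reasonable way to recover the unassigned shards? A sketch of what I had in mind (the target node name is a placeholder, and I understand allow_primary accepts losing whatever data was in that shard copy):

curl -s -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
  "commands": [
    {
      "allocate": {
        "index": "index-2017.04.05",
        "shard": 2,
        "node": "node01",
        "allow_primary": true
      }
    }
  ]
}'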
I was hoping it was a disk space issue, but from your df output it is not. With sar history I would be able to trace back through the I/O stats for that period; without it, it is hard to say what went wrong with the I/O. Is disk3 an NFS mount? Can you attach the dmesg output for the timeframe in which this problem occurred?
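For the I/O side, something like the following run on node06 should show whether the kernel logged errors against the disk backing /apps/elasticsearch/disk3 (the device name and the sysstat file path are assumptions; adjust them to your system):

# Kernel messages mentioning I/O or ATA/SCSI errors.
dmesg | grep -iE 'i/o error|ata[0-9]|sd[a-z]'

# Historical per-device I/O stats from sysstat, if it is installed
# (saNN is the daily history file for the day the merge failures started).
sar -d -f /var/log/sa/sa05

# SMART health of the suspect disk (/dev/sdX is a placeholder).
smartctl -a /dev/sdX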