Index corrupt exception

Hi,

We are seeing a filesystem corruption issue, and as a result two shards are not accessible on the cluster. We found no problem on the hardware side, and all other shards are accessible and available.

Error:

"store_exception" : {
          "type" : "corrupt_index_exception",
          "reason" : "failed engine (reason: [merge failed]) (resource=preexisting_corruption)",
          "caused_by" : {
            "type" : "i_o_exception",
            "reason" : "failed engine (reason: [merge failed])",
            "caused_by" : {
              "type" : "corrupt_index_exception",
              "reason" : """checksum failed (hardware problem?) : expected=c0axxx actual=540xxx (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/x/x/x/x")))"""
            }
          }
        }

When we try to run the CheckIndex command, we get the error below.

Command:
java -cp lucene-core*.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex path

Error:

ERROR: could not read any segments file in directory
java.lang.IllegalArgumentException: Could not load codec 'Lucene84'.  Did you forget to add lucene-backward-codecs.jar?
        at org.apache.lucene.index.SegmentInfos.readCodec(SegmentInfos.java:449)
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:356)
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:522)
        at org.apache.lucene.index.CheckIndex.doCheck(CheckIndex.java:2962)
        at org.apache.lucene.index.CheckIndex.doMain(CheckIndex.java:2860)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2786)
        Suppressed: org.apache.lucene.index.CorruptIndexException: checksum passed (74xxxxx). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/x/x/Segment")))
                at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:466)
                at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434)
                ... 5 more
Caused by: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene84' does not exist.  You need to add the corresponding JAR file supporting this SPI to your classpath.  The current classpath supports the following names: [Lucene87]
        at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:116)
        at org.apache.lucene.codecs.Codec.forName(Codec.java:116)
        at org.apache.lucene.index.SegmentInfos.readCodec(SegmentInfos.java:445)
        ... 6 more

Can someone please suggest the next steps?
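
A note on the CheckIndex failure itself: the message says the classpath only provides the 'Lucene87' codec, while the segments file names 'Lucene84', and it suggests adding lucene-backward-codecs.jar. The sketch below only prints a command with both jars on the classpath so it can be reviewed before running. The function name checkindex_cmd and both paths are hypothetical, and it assumes a lucene-backward-codecs jar of the same Lucene version sits next to lucene-core in the given directory. This only gets CheckIndex to load the index; it does not address the checksum failure reported by Elasticsearch.

```shell
#!/bin/sh
# Print (not run) a CheckIndex command whose classpath includes both
# lucene-core and lucene-backward-codecs from the given lib directory.
checkindex_cmd() {
  lib_dir="$1"     # assumption: directory holding the Lucene jars
  index_dir="$2"   # placeholder for the shard's index directory
  cp=""
  for jar in "$lib_dir"/lucene-core-*.jar "$lib_dir"/lucene-backward-codecs-*.jar; do
    [ -e "$jar" ] || continue          # skip patterns that matched nothing
    cp="${cp:+$cp:}$jar"               # join entries with ':' (classpath separator)
  done
  echo "java -cp $cp -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex $index_dir"
}

# Hypothetical paths; substitute your own.
checkindex_cmd /usr/share/elasticsearch/lib /path/to/shard/index
```

If the printed classpath is empty or lists only one jar, the directory does not contain what this sketch assumes.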

... except for the reported checksum failure, which means the data on disk isn't the data Elasticsearch wrote. Either some other software has changed it or there is a hardware issue. It's rather common for there to be no other symptoms of a hardware issue apart from this kind of checksum failure.

I would suggest replacing the suspect disks and then restoring this index from a recent snapshot.
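
As a rough sketch of the restore step, the snapshot restore API can bring the index back under a new name, so that nothing existing is touched until the restored copy has been checked. Every name here (my_repository, my_snapshot, my_index, the restored_ prefix, the URL) is a placeholder rather than a value from this thread, and the es helper with its DRY_RUN switch is only an illustration that prints the request by default. Restoring over the original name instead would require the existing index to be deleted or closed first.

```shell
#!/bin/sh
# All values below are hypothetical placeholders.
ES_URL="${ES_URL:-http://localhost:9200}"
REPO="${REPO:-my_repository}"
SNAPSHOT="${SNAPSHOT:-my_snapshot}"
INDEX="${INDEX:-my_index}"

# With DRY_RUN=1 (the default here) the request is printed, not sent.
DRY_RUN="${DRY_RUN:-1}"
es() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "curl $*"
  else
    curl "$@"
  fi
}

# Restore only the affected index, renamed to restored_<name>.
# The format string is single-quoted so that $1 stays literal for Elasticsearch.
body=$(printf '{"indices":"%s","rename_pattern":"(.+)","rename_replacement":"restored_$1"}' "$INDEX")

es -X POST "$ES_URL/_snapshot/$REPO/$SNAPSHOT/_restore" \
  -H 'Content-Type: application/json' \
  -d "$body"
```

Documents indexed after the snapshot was taken will not be in the restored copy, so those would need to be reindexed from the original source.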

Hi David,

Is there a way to delete only the corrupted records from the index?

Thanks,
Jaya

No, you'll need to restore from a snapshot.

What type of storage are you using?
