org.apache.lucene.index.CorruptIndexException: checksum failed

I run Lucene (not Elasticsearch) on Ubuntu, with the index stored on a disk (actually a RocketStor enclosure) attached to a Mac mini and mounted over NFS. I get this error constantly. I have already run org.apache.lucene.index.CheckIndex, but the program still crashes after a restart.

Exception in thread "main" org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
        at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:724)
        at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:738)
        at org.apache.lucene.index.IndexWriter.numDocs(IndexWriter.java:1198)
        at xxxx.xxxxx.search.XxxxxxxxIndexer.close(XxxxxxxxIndexer.java:184)
        at xxxx.xxxxx.search.ThreadedXxxxxxxxIndexer.close(ThreadedXxxxxxxxIndexer.java:59)
        at xxxx.xxxxx.search.ThreadedXxxxxxxxIndexer.main(ThreadedXxxxxxxxIndexer.java:136)

Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=51fbdb5c actual=6e964d17 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/HPT8_56T/xxxxxxxxx-index/index1/_mq.cfs") [slice=_mq_Lucene50_0.pos]))
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:365)
        at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:469)
        at org.apache.lucene.codecs.lucene50.Lucene50PostingsReader.checkIntegrity(Lucene50PostingsReader.java:1286)
        at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:336)
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:317)
        at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96)
        at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:211)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:96)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4099)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3679)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
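
For reference, a minimal sketch of the kind of CheckIndex run I mean (assuming the Lucene 5.x API, which matches the Lucene50 codec in the trace; the index path is a placeholder):

    import java.nio.file.Paths;
    import org.apache.lucene.index.CheckIndex;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class CheckIndexSketch {
        public static void main(String[] args) throws Exception {
            // Open the index directory (placeholder path) and verify checksums.
            try (Directory dir = FSDirectory.open(Paths.get("/path/to/index"));
                 CheckIndex checker = new CheckIndex(dir)) {
                CheckIndex.Status status = checker.checkIndex();
                if (!status.clean) {
                    // WARNING: exorciseIndex() permanently removes the corrupt
                    // segments, along with all documents they contain.
                    checker.exorciseIndex(status);
                }
            }
        }
    }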

Don't use NFS with Elasticsearch.


I only use Lucene, not ES; maybe NFS is the problem.

What exactly is the problem with NFS?

If you delete files over NFS, clients may be left with stale file handles. This is common during indexing, when segment files are constantly created and deleted.

If an NFS share is used by two or more writing clients, Lucene's locking can get confused.
A Lucene index can only be written to by one JVM at a time.
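
For illustration (a sketch, assuming the Lucene 5.x API and a placeholder path): the default NativeFSLockFactory relies on OS-level file locks that many NFS setups don't honor, so over NFS you would typically switch to SimpleFSLockFactory, accepting that a stale write.lock may need manual removal after a crash:

    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.store.SimpleFSLockFactory;

    public class SingleWriterOverNfs {
        public static void main(String[] args) throws Exception {
            // SimpleFSLockFactory creates a plain write.lock marker file, which
            // works over NFS where native OS file locks are unreliable. The
            // trade-off: a crashed JVM leaves the lock behind until deleted.
            try (Directory dir = FSDirectory.open(Paths.get("/path/to/index"),
                                                  SimpleFSLockFactory.INSTANCE);
                 IndexWriter writer = new IndexWriter(dir,
                         new IndexWriterConfig(new StandardAnalyzer()))) {
                // Exactly one IndexWriter (one JVM) may hold this lock at a time.
            }
        }
    }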

You cannot use the memory-mapped mmapfs store, only the simplefs store. https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-store.html
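
In plain-Lucene terms (a sketch with a placeholder path): mmapfs corresponds to MMapDirectory and simplefs to SimpleFSDirectory, so instead of letting FSDirectory.open() pick MMapDirectory on a 64-bit JVM, you would construct the simple implementation explicitly:

    import java.nio.file.Paths;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.SimpleFSDirectory;

    public class NfsDirectoryChoice {
        public static void main(String[] args) throws Exception {
            // SimpleFSDirectory uses plain file reads instead of mmap,
            // avoiding memory-mapped I/O over the NFS mount.
            try (Directory dir = new SimpleFSDirectory(Paths.get("/path/to/index"))) {
                // use dir with an IndexWriter / IndexReader as usual
            }
        }
    }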

Also, NFS is generally slow: data being stored must be unmapped from memory and sent over the NFS wire to the target file system before it is written. NFS reads and writes are buffered on their own and cannot benefit from the mmapfs buffer acceleration in ES/Lucene. This can be a factor of 5-10x slower.

@suiyuan2009 Perhaps you'd be better off asking on one of the Lucene mailing lists? These forums mainly support Elasticsearch and ES-related issues, so we can't really help you much with direct Lucene questions.

https://lucene.apache.org/core/discussion.html

Also echoing the sentiment not to use NFS 🙂

Thank you, I will try the NFS noac mount option to see if it's an NFS caching problem.

Thanks.