CorruptIndexException after boot

Hi,
I should have taken snapshots, but I have to ask: is it possible to recover from this error? ES doesn't start. The ES version is 7.17.24, on a one-node cluster. Thank you.

[2025-08-06T10:25:59,213][ERROR][o.e.b.Bootstrap          ] [pythia] Exception
org.elasticsearch.ElasticsearchException: failed to bind service
        at org.elasticsearch.node.Node.<init>(Node.java:1089) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:169) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:160) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) [elasticsearch-cli-7.17.24.jar:7.17.24]
        at org.elasticsearch.cli.Command.main(Command.java:77) [elasticsearch-cli-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:125) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) [elasticsearch-7.17.24.jar:7.17.24]
Caused by: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/nodes/0/_state/_mem.si")))
        at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:523) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:414) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:465) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.codecs.lucene86.Lucene86SegmentInfoFormat.read(Lucene86SegmentInfoFormat.java:143) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:03:50]
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:64) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:720) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:84) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:64) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.elasticsearch.gateway.PersistedClusterStateService.nodeMetadata(PersistedClusterStateService.java:306) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.env.NodeEnvironment.loadNodeMetadata(NodeEnvironment.java:459) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:356) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.node.Node.<init>(Node.java:429) ~[elasticsearch-7.17.24.jar:7.17.24]
        ... 11 more

Hello @Oozza

As this is a one-node cluster, maybe stop the Elasticsearch service, try renaming the file shown in the error (path="/var/lib/elasticsearch/nodes/0/_state/_mem.si"), and then start Elasticsearch to see what happens. If this cluster is important, you can back up /var/lib/elasticsearch before making any changes.

Thanks!!

Please don’t do this, or even suggest this sort of thing to other users. Modifying files within the Elasticsearch data path is pretty much guaranteed to make things worse.

That said, in this case it doesn’t really matter: AFAIK there isn’t a way to recover from a missing or truncated .si file in the cluster state. The file is required, so I expect renaming it will just cause a different error.
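
For anyone who wants to see what the "codec footer mismatch" message refers to, here is a minimal read-only sketch in Python. It assumes the Lucene footer layout as I understand it from `CodecUtil`: the last 16 bytes of a file hold a big-endian int magic (-1071082520, the "expected footer" value in the log), an int algorithm ID of 0, and a long holding the CRC32 of every byte before the checksum field. The function name `inspect_lucene_footer` is made up for this illustration. It only diagnoses and cannot repair anything, and it is best run against a copy of the file rather than the live data path.

```python
import struct
import zlib
from pathlib import Path

# Assumed footer layout (big-endian): int magic, int algorithm id, long CRC32.
FOOTER_MAGIC_SIGNED = -1071082520  # 0xC02893E8 read as a signed 32-bit int
FOOTER_LENGTH = 16


def inspect_lucene_footer(path):
    """Read a file and report whether its last 16 bytes look like a valid footer."""
    data = Path(path).read_bytes()
    if len(data) < FOOTER_LENGTH:
        return {"ok": False, "size": len(data),
                "reason": "file is shorter than a 16-byte footer"}

    footer = data[-FOOTER_LENGTH:]
    magic_signed, algorithm_id = struct.unpack(">ii", footer[:8])
    (stored_crc,) = struct.unpack(">Q", footer[8:])
    # The checksum covers everything before the 8-byte checksum field itself.
    computed_crc = zlib.crc32(data[:-8]) & 0xFFFFFFFF

    if magic_signed != FOOTER_MAGIC_SIGNED:
        reason = "footer magic mismatch (file truncated or zero-filled?)"
    elif algorithm_id != 0:
        reason = "unexpected checksum algorithm id"
    elif stored_crc != computed_crc:
        reason = "checksum mismatch"
    else:
        reason = None

    return {"ok": reason is None, "size": len(data), "reason": reason,
            "magic_signed": magic_signed, "algorithm_id": algorithm_id,
            "stored_crc": stored_crc, "computed_crc": computed_crc}
```

Going by the log line "actual footer=0", a file in that state would come back with `magic_signed == 0`, which is what you would see if the tail of the file was never written or was zero-filled.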

Did you experience a power outage or other hard system crash just before this problem arose? If so, the error suggests that your storage is illegally reordering writes across fsync() barriers, which is a common trick that some disks use to make things run faster (at the expense of correctness).
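
To illustrate what an fsync() barrier is meant to guarantee, here is a generic sketch in Python. It is not Elasticsearch's or Lucene's actual code, and the helper name `durable_write` is invented for this example. The idea is that the application writes a file, calls fsync so the contents are on stable storage, and only then publishes the file under its final name. If the storage reorders writes across those barriers, a crash can leave a published file whose contents never reached the disk. The sketch assumes a POSIX system such as Linux, where a directory can be opened and fsynced.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path so that a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Barrier 1: the contents must be durable before the file is published.
            os.fsync(tmp.fileno())
        # Atomic rename: readers see the old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Barrier 2: make the rename itself durable by syncing the directory.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Software that follows this pattern still depends on the disk honouring each fsync; a drive or controller that acknowledges the sync before the data is really persistent defeats it.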

Thanks for the response. There was a disk error and the disk was remounted read-only.

The docs for elasticsearch-node contain this text, which implies all might not be lost:

However, if the disaster is serious enough then it may not be possible to recover from a recent snapshot either. Unfortunately in this case there is no way forward that does not risk data loss, but it may be possible to use the elasticsearch-node tool to construct a new cluster that contains some of the data from the failed cluster.

I've never been in the situation described here, but might that be worth a try for @Oozza?

There is also:

Each node stores its data in the data directories defined by the path.data setting. This means that in a disaster you can also restart a node by moving its data directories to another host, presuming that those data directories can be recovered from the faulty host

Again, I've no specific experience here. Maybe someone has written a how-to on the precise steps to follow? But if @Oozza can copy his data off to another server, there may be some chance.
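
If you do try copying the data off a failing disk, a sketch along these lines might help. It is in Python, the function name `copy_tree_tolerant` is hypothetical, and it assumes Elasticsearch is stopped while it runs. It copies a directory tree file by file and records which files could not be read instead of aborting at the first I/O error. It only preserves what is still readable; it does not repair a corrupt file, and nothing here establishes that a node on another host would start from the copy.

```python
import os
import shutil


def copy_tree_tolerant(src_root, dst_root):
    """Copy src_root into dst_root, returning (copied, failed) lists of paths."""
    copied, failed = [], []

    def record_walk_error(exc):
        # Called by os.walk when a directory cannot be listed.
        failed.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(src_root, onerror=record_walk_error):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                shutil.copy2(src, dst)  # copies contents plus timestamps/permissions
                copied.append(src)
            except OSError as exc:
                # Typical on a failing disk: record the error and keep going.
                failed.append((src, str(exc)))
    return copied, failed
```

A call such as `copy_tree_tolerant("/var/lib/elasticsearch", "/mnt/rescue/elasticsearch")` (the destination path is just an example) returns the list of failures, which at least shows how much of the data directory is still readable.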

Thanks for the heads-up.
I will definitely be more mindful when making suggestions in the future. I have tested it locally and found that once the file is corrupted, renaming it doesn’t help. Elasticsearch requires the last known good version to start up again. Appreciate the clarification and I will be more cautious going forward.

Thanks!!
