CorruptIndexException after boot

Oozza · August 6, 2025, 12:00pm

Hi,
I should have taken snapshots, but I have to ask: is it possible to recover from this error? Elasticsearch doesn't start. The version is 7.17.24, on a one-node cluster. Thank you.

[2025-08-06T10:25:59,213][ERROR][o.e.b.Bootstrap          ] [pythia] Exception
org.elasticsearch.ElasticsearchException: failed to bind service
        at org.elasticsearch.node.Node.<init>(Node.java:1089) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:169) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:160) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) [elasticsearch-cli-7.17.24.jar:7.17.24]
        at org.elasticsearch.cli.Command.main(Command.java:77) [elasticsearch-cli-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:125) [elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) [elasticsearch-7.17.24.jar:7.17.24]
Caused by: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/var/lib/elasticsearch/nodes/0/_state/_mem.si")))
        at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:523) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:414) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:465) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.codecs.lucene86.Lucene86SegmentInfoFormat.read(Lucene86SegmentInfoFormat.java:143) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:03:50]
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:64) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:720) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:84) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:64) ~[lucene-core-8.11.3.jar:8.11.3 baa7c80af4278cc8951a344d8e9320386588d12d - houstonputman - 2024-02-05 15:02:58]
        at org.elasticsearch.gateway.PersistedClusterStateService.nodeMetadata(PersistedClusterStateService.java:306) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.env.NodeEnvironment.loadNodeMetadata(NodeEnvironment.java:459) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:356) ~[elasticsearch-7.17.24.jar:7.17.24]
        at org.elasticsearch.node.Node.<init>(Node.java:429) ~[elasticsearch-7.17.24.jar:7.17.24]
        ... 11 more

Tortoise · August 7, 2025, 3:32am

Hello @Oozza

As this is a one-node cluster, you could stop the Elasticsearch service, rename the file shown in the error (path="/var/lib/elasticsearch/nodes/0/_state/_mem.si"), and then start Elasticsearch again to see what happens. If this cluster is important, back up /var/lib/elasticsearch before making any changes.

Thanks!!

DavidTurner · August 7, 2025, 9:01am

Please don’t do this, or even suggest this sort of thing to other users. Modifying files within the Elasticsearch data path is pretty much guaranteed to make things worse.

That said, in this case it doesn’t really matter: there isn’t a way to recover from a missing or truncated .si file in the cluster state, AFAIK. The file is required, so I expect renaming it will just cause a different error.
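For anyone who wants to see what "codec footer mismatch" refers to, here is a minimal read-only sketch in Python. It assumes the Lucene 8.x footer layout: the last 16 bytes are a big-endian int magic, an int algorithm id of 0, and a long CRC32 of everything before the checksum. The function name `check_lucene_footer` is made up for this example. It is not an Elasticsearch tool and it only reads the file; it cannot repair anything.

```python
import struct
import zlib
from pathlib import Path

CODEC_MAGIC = 0x3FD76C17
# Bitwise complement of the header magic: 0xC02893E8, which is
# -1071082520 when printed as a signed 32-bit int (as in the log above).
FOOTER_MAGIC = ~CODEC_MAGIC & 0xFFFFFFFF
FOOTER_LENGTH = 16  # int magic + int algorithm id + long checksum


def check_lucene_footer(path):
    """Return a short verdict for one Lucene file. Never writes to it."""
    data = Path(path).read_bytes()
    if len(data) < FOOTER_LENGTH:
        return "too short to contain a footer"
    magic, algorithm_id, stored_crc = struct.unpack(">IIQ", data[-FOOTER_LENGTH:])
    if magic != FOOTER_MAGIC:
        return f"footer magic mismatch: actual={magic:#010x} expected={FOOTER_MAGIC:#010x}"
    if algorithm_id != 0:
        return f"unknown checksum algorithm id {algorithm_id}"
    # The checksum covers every byte before the 8-byte checksum field.
    if zlib.crc32(data[:-8]) != stored_crc:
        return "checksum mismatch"
    return "footer ok"
```

Run it against a copy of the file rather than the live data path. An "actual footer=0" as in the log means the bytes where the magic should be are zero, which is what a file that was allocated but never fully written tends to look like.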

Did you experience a power outage or other hard system crash just before this problem arose? If so, the error suggests that your storage is illegally reordering writes across fsync() barriers, which is a common trick that some disks use to make things run faster (at the expense of correctness).
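To illustrate what an fsync() barrier is supposed to guarantee, here is a minimal Python sketch of the usual write, fsync, rename pattern. It is not Elasticsearch's or Lucene's actual code; `durable_write` is a hypothetical helper, and the directory fsync assumes a POSIX system. The point is the ordering: the new name must not become visible until the file contents are on stable storage.

```python
import os


def durable_write(path, payload):
    """Write payload to path so that a crash leaves the old or the new file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(payload)
        f.flush()
        # Barrier 1: the file contents must reach stable storage first.
        os.fsync(f.fileno())
    # Atomically publish the fully written file under its final name.
    os.replace(tmp_path, path)
    # Barrier 2: persist the directory entry created by the rename.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If the storage acknowledges the first fsync() before the data is really durable and then persists the rename first, a crash can leave the final name pointing at a truncated or zero-filled file, which matches the symptom in the log.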

Oozza · August 7, 2025, 11:43am

Thanks for the response. There was a disk error and it was remounted read-only.

RainTown · August 7, 2025, 12:17pm

The docs for elasticsearch-node contain this text, which implies all might not be lost:

However, if the disaster is serious enough then it may not be possible to recover from a recent snapshot either. Unfortunately in this case there is no way forward that does not risk data loss, but it may be possible to use the elasticsearch-node tool to construct a new cluster that contains some of the data from the failed cluster.

I've never had the situation described here, but might that not be worth a try for @Oozza?

There is also:

Each node stores its data in the data directories defined by the path.data setting. This means that in a disaster you can also restart a node by moving its data directories to another host, presuming that those data directories can be recovered from the faulty host

Again, I've no specific experience here. Maybe someone has written a how-to on the precise steps to follow? But if @Oozza can copy his data off to another server, there may be some chance.
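For the "copy the data off" step, one possible approach is sketched below in Python, assuming Elasticsearch is stopped and the source is only read, never changed. The helper names and the idea of verifying each file with SHA-256 are my own for illustration; this is not a documented Elasticsearch procedure. Files that cannot be read (likely on a failing disk) are reported instead of aborting the whole copy.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large segment files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy every file under src to dst; return the files that failed."""
    src, dst = Path(src), Path(dst)
    failures = []
    for source_file in sorted(p for p in src.rglob("*") if p.is_file()):
        target = dst / source_file.relative_to(src)
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source_file, target)
            # Re-read both sides to catch silent read or write errors.
            if sha256_of(source_file) != sha256_of(target):
                failures.append(str(source_file))
        except OSError:
            failures.append(str(source_file))
    return failures
```

A call might look like `copy_and_verify("/var/lib/elasticsearch", "/mnt/rescue/elasticsearch")`, where the destination path is only an example. Note that empty directories are not recreated by this sketch, and a clean copy of an already truncated file is still a truncated file.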

Tortoise · August 7, 2025, 1:24pm

Thanks for the heads-up.
I will definitely be more mindful when making suggestions in the future. I have tested it locally and found that once the file is corrupted, renaming it doesn’t help. Elasticsearch requires the last known good version to start up again. Appreciate the clarification and I will be more cautious going forward.

Thanks!!
