["org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: CorruptIndexException[codec footer mismatch (file truncated?)

lins · November 21, 2023, 2:21pm

{"type": "server", "timestamp": "2023-11-21T09:52:52,412Z", "level": "ERROR", "component": "o.e.b.ElasticsearchUncaughtExceptionHandler", "cluster.name": "elasticsearch", "node.name": "elasticsearch-es-master-1", "message": "uncaught exception in thread [main]",
"stacktrace": ["org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: CorruptIndexException[codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/nodes/0/_state/_1g1h.si")))];",
"at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:173) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:160) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-7.17.8.jar:7.17.8]",
"at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-7.17.8.jar:7.17.8]",
"at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:125) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-7.17.8.jar:7.17.8]",
"Caused by: org.elasticsearch.ElasticsearchException: failed to bind service",
"at org.elasticsearch.node.Node.<init>(Node.java:1088) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.node.Node.<init>(Node.java:309) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:234) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:234) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:434) ~[elasticsearch-7.17.8.jar:7.17.8]",
"at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:169) ~[elasticsearch-7.17.8.jar:7.17.8]",
"... 6 more",
"Caused by: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/nodes/0/_state/_1g1h.si")))",
"at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:523) ~[lucene-cor

DavidTurner · November 21, 2023, 6:41pm

lins · November 21, 2023, 11:41pm

Hi DavidTurner, we have 3 master nodes and 9 workers. The power outage caused 2 of the master nodes to report the above error.

master-0 error

elasticsearch Likely root cause: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/nodes/0/_state/_1j1o.si")))

master-1 error

elasticsearch ElasticsearchException[failed to bind service]; nested: CorruptIndexException[codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/nodes/0/_state/_1g1h.si")))];
elasticsearch Likely root cause: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=0 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/nodes/0/_state/_1g1h.si")))

DavidTurner · November 22, 2023, 6:32am

I think that's covered by the docs I linked above, particularly:

If a file is needed to recover an index after a restart then your storage system previously confirmed to Elasticsearch that this file was durably synced to disk. On Linux this means that the fsync() system call returned successfully. Elasticsearch sometimes reports that an index is corrupt because a file needed for recovery has been truncated or is missing its footer. This indicates that your storage system acknowledges durable writes incorrectly.
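To make the truncated-footer symptom concrete, here is a minimal Python sketch of the kind of check behind the `codec footer mismatch` message. It assumes the footer layout used by the Lucene 8.x line that ships with Elasticsearch 7.17: the last 16 bytes of a file hold a big-endian magic number, an algorithm ID, and a CRC32 checksum. The helper names `to_signed32` and `inspect_footer` are made up for illustration, and this is not Lucene's actual code.

```python
import struct
import zlib

# Assumed constants, mirroring Lucene's CodecUtil in the 8.x line.
CODEC_MAGIC = 0x3FD76C17
FOOTER_MAGIC = ~CODEC_MAGIC & 0xFFFFFFFF  # 0xC02893E8
FOOTER_LENGTH = 16  # 4-byte magic + 4-byte algorithm ID + 8-byte checksum


def to_signed32(value):
    """Render an unsigned 32-bit value the way a Java int would print."""
    return value - (1 << 32) if value >= (1 << 31) else value


def inspect_footer(data):
    """Hypothetical helper: check the footer at the end of a file's raw bytes."""
    if len(data) < FOOTER_LENGTH:
        return {"ok": False, "reason": "file is shorter than a footer"}
    magic, algorithm_id, stored_crc = struct.unpack(">IIQ", data[-FOOTER_LENGTH:])
    if magic != FOOTER_MAGIC:
        reason = "codec footer mismatch: actual footer=%d vs expected footer=%d" % (
            to_signed32(magic),
            to_signed32(FOOTER_MAGIC),
        )
        return {"ok": False, "reason": reason}
    # Assumption: the checksum covers every byte before the stored checksum,
    # including the footer magic and the algorithm ID.
    actual_crc = zlib.crc32(data[:-8]) & 0xFFFFFFFF
    if algorithm_id != 0 or stored_crc != actual_crc:
        return {"ok": False, "reason": "checksum or algorithm ID mismatch"}
    return {"ok": True, "reason": "footer and checksum match"}
```

Under these assumptions, a file whose tail was never written and reads back as zeros gives `actual footer=0`, and the expected magic `0xC02893E8` prints as `-1071082520` when read as a signed 32-bit integer, which matches the numbers in the log above.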

lins · November 23, 2023, 5:49am

Thank you. Is there any way to resolve the footer CRC mismatch and restore the ES cluster?

The master volumes use PVs provided by Longhorn. The power outage in the server room restarted the Longhorn distributed storage system, and ES was restarted by the same outage.

DavidTurner · November 23, 2023, 6:16am

You'll need to ask the Longhorn folks if there's any way to restore the data it lost. If they say no then you'll need to restore the cluster from a recent snapshot.

lins · November 23, 2023, 6:46am

thanks @DavidTurner

lins · November 23, 2023, 7:17am

Hi David, is it possible to repair the master nodes using the metadata held on the data nodes?

DavidTurner · November 23, 2023, 7:39am

No, the cluster metadata is only stored on a majority (i.e. 2 of 3) of the master nodes.
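The "2 of 3" arithmetic can be sketched in a few lines of Python. This is a simplification that assumes all three master nodes are in the voting configuration and ignores any later voting configuration changes; the function names are hypothetical.

```python
def majority(voting_nodes):
    """Smallest number of nodes that is more than half of voting_nodes."""
    return voting_nodes // 2 + 1


def can_form_quorum(voting_nodes, healthy_nodes):
    """True if enough voting nodes with intact state remain to elect a master."""
    return healthy_nodes >= majority(voting_nodes)


# The situation in this thread: 3 master nodes, 2 of them with corrupt state.
masters = 3
healthy = masters - 2
print(majority(masters))                   # 2
print(can_form_quorum(masters, healthy))   # False
```

With only one intact master out of three, the remaining node cannot reach a majority on its own, which is why the data nodes cannot stand in for the lost masters.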

lins · November 23, 2023, 8:05am

OK, thanks very much.

DavidTurner · November 23, 2023, 8:23am

FWIW here are the relevant docs:

If the logs or the health report indicate that Elasticsearch can’t discover enough nodes to form a quorum, you must address the reasons preventing Elasticsearch from discovering the missing nodes. The missing nodes are needed to reconstruct the cluster metadata. Without the cluster metadata, the data in your cluster is meaningless. The cluster metadata is stored on a subset of the master-eligible nodes in the cluster. If a quorum can’t be discovered, the missing nodes were the ones holding the cluster metadata.

Ensure there are enough nodes running to form a quorum and that every node can communicate with every other node over the network. Elasticsearch will report additional details about network connectivity if the election problems persist for more than a few minutes. If you can’t start enough nodes to form a quorum, start a new cluster and restore data from a recent snapshot. Refer to Quorum-based decision making for more information.
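For the "restore data from a recent snapshot" step, a rough Python sketch of building the restore request against the snapshot restore REST endpoint might look like the following. The base URL, repository name, and snapshot name are placeholders, not values from this thread, and the request body is only one plausible choice that would need adapting to the new cluster.

```python
import json
import urllib.request


def build_restore_request(base_url, repository, snapshot, indices="*"):
    """Build (but do not send) a POST to the _restore endpoint of a snapshot."""
    url = "%s/_snapshot/%s/%s/_restore" % (base_url.rstrip("/"), repository, snapshot)
    # Assumption: restore the selected indices without the saved cluster state.
    body = json.dumps({"indices": indices, "include_global_state": False})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder names; substitute the real repository and snapshot.
request = build_restore_request("http://localhost:9200", "my_repository", "my_snapshot")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. A secured cluster would also need authentication and TLS settings, which are left out here.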

