My Elasticsearch cluster fails on startup with "failed to load metadata", and I can't see what else might be wrong. If anyone has run into the same problem, please help me take a look.

  • Here is the error message:
    {"@timestamp":"2023-02-23T08:43:40.767Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"elastic-cluster-0","elasticsearch.cluster.name":"elasticsearch-cluster","error.type":"org.elasticsearch.ElasticsearchException","error.message":"failed to load metadata","error.stack_trace":"org.elasticsearch.ElasticsearchException: failed to load metadata\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:161)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.node.Node.start(Node.java:1354)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.bootstrap.Elasticsearch.start(Elasticsearch.java:436)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:229)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)\nCaused by: org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/_state/segments_3")))\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:301)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:166)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1158)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.gateway.PersistedClusterStateService.createIndexWriter(PersistedClusterStateService.java:264)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.gateway.PersistedClusterStateService.createWriter(PersistedClusterStateService.java:226)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.gateway.GatewayMetaState$LucenePersistedState.<init>(GatewayMetaState.java:447)\n\tat org.elasticsearch.server@8.6.0/org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:130)\n\t... 4 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/_state/_1.si\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:181)\n\tat java.base/java.nio.channels.FileChannel.open(FileChannel.java:304)\n\tat java.base/java.nio.channels.FileChannel.open(FileChannel.java:363)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.store.Directory.openChecksumInput(Directory.java:156)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.codecs.lucene90.Lucene90SegmentInfoFormat.read(Lucene90SegmentInfoFormat.java:102)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.SegmentInfos.parseSegmentInfos(SegmentInfos.java:406)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:363)\n\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:299)\n\t... 11 more\n\tSuppressed: org.apache.lucene.index.CorruptIndexException: checksum passed (39cc8ebc). possibly transient resource issue, or a Lucene or JVM bug (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/usr/share/elasticsearch/data/_state/segments_3")))\n\t\tat org.apache.lucene.core@9.4.2/org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:500)\n\t\tat org.apache.lucene.core@9.4.2/org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:370)\n\t\t... 12 more\n"}

The cluster has three machines, and all of them are up. I am confident that the single-node setup I ran before moving to a cluster worked fine.

Please don't only post an error message with no other information.

Dec 10th, 2022: [EN] Asking top notch technical questions to get you help quicker! has some guidance on what to provide.

Oh sorry, my mistake.

What version of Elasticsearch are you using?

The Elasticsearch image version is 8.6.0.

A required file is missing:

Caused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/_state/_1.si

Elasticsearch didn't delete this file, so you have some problem with your storage or something else removed this file. If the rest of the cluster is healthy then the simplest fix is to wipe this node and reinstall from scratch, but you must work out what caused this file to go missing or else it will just happen again.
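For anyone reading a log line like the one above, here is a minimal Python sketch of how the innermost cause can be pulled out of the JSON record. It assumes the log is one JSON object per line with the stack trace under the `error.stack_trace` key, as in the message posted in this thread. The `root_causes` helper and the shortened `sample` record are illustrative, not part of Elasticsearch.

```python
import json
import re


def root_causes(log_line: str) -> list[str]:
    """Return the 'Caused by:' entries of a JSON log line, outermost first."""
    record = json.loads(log_line)
    trace = record.get("error.stack_trace", "")
    # Each cause starts at the beginning of a line inside the stack trace.
    return re.findall(r"^Caused by: (.+)$", trace, flags=re.MULTILINE)


# Shortened, hypothetical copy of the record from this thread.
sample = json.dumps({
    "error.message": "failed to load metadata",
    "error.stack_trace": (
        "org.elasticsearch.ElasticsearchException: failed to load metadata\n"
        "\tat org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:161)\n"
        "Caused by: org.apache.lucene.index.CorruptIndexException: "
        "Unexpected file read error while reading index.\n"
        "\tat org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:301)\n"
        "Caused by: java.nio.file.NoSuchFileException: "
        "/usr/share/elasticsearch/data/_state/_1.si\n"
    ),
})

causes = root_causes(sample)
print(causes[-1])  # the last entry is the innermost cause
```

The last "Caused by" entry is usually the one to start from. Here it names the missing file rather than the more general corruption error above it.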

Thanks for your reply, David. I am only just installing Elasticsearch, and every node in the cluster reports this error, so the other nodes are unhealthy too. My storage is remote OSS storage supplied by our provider. I suspect the nodes are overwriting each other's files: perhaps only one node's files survive and the other two nodes' files get overwritten, which would explain the constant reports of missing files.

The problem is solved. The root cause was that I had not set a distinct path.data for each node, so all the nodes were writing their files to the same directory.
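In case it helps someone else, here is a rough Python sketch of the check that would have caught this. It assumes every node mounts the same remote volume, so an identical path.data means the same physical directory. The node names other than elastic-cluster-0, and the helper names `shared_data_paths` and `per_node_path`, are made up for illustration.

```python
import posixpath
from collections import defaultdict


def shared_data_paths(node_paths: dict[str, str]) -> dict[str, list[str]]:
    """Map each data path used by more than one node to those node names."""
    by_path = defaultdict(list)
    for node, path in node_paths.items():
        # Normalise so that a trailing slash does not hide a collision.
        by_path[posixpath.normpath(path)].append(node)
    return {p: sorted(n) for p, n in by_path.items() if len(n) > 1}


def per_node_path(base: str, node: str) -> str:
    """One possible fix: give each node its own subdirectory under base."""
    return posixpath.join(base, node)


base = "/usr/share/elasticsearch/data"
# Hypothetical three-node layout where every node uses the same path.data.
broken = {
    "elastic-cluster-0": base,
    "elastic-cluster-1": base + "/",
    "elastic-cluster-2": base,
}
fixed = {node: per_node_path(base, node) for node in broken}

print(shared_data_paths(broken))  # all three nodes collide on one path
print(shared_data_paths(fixed))   # no collisions once the paths differ
```

Per-node subdirectories are only one way to separate the data. Giving each node its own volume works just as well, as long as no two nodes end up writing to the same directory.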

By the way, I did not set up my Elasticsearch cluster locally; I set it up on Kubernetes (k8s).
