I filed a bug several days ago but have had no response. Has no one else had issues after upgrading zlib to 1.2.12?
I tried this on a fresh, clean data directory with a single node. If I downgrade to 1.2.11 the service starts fine and I can create indexes and index documents; with zlib 1.2.12 the service fails with the errors below.
/usr/share/Elasticsearch/logs/Elasticsearch.log
[2022-04-05T05:06:08,301][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [gxdev1] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: org.elasticsearch.ElasticsearchException: failed to load metadata
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:170) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:157) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.common.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:81) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112) ~[elasticsearch-cli-8.1.0.jar:8.1.0]
at org.elasticsearch.cli.Command.main(Command.java:77) ~[elasticsearch-cli-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:122) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-8.1.0.jar:8.1.0]
Caused by: org.elasticsearch.ElasticsearchException: failed to load metadata
at org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:162) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.node.Node.start(Node.java:1142) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:272) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:367) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-8.1.0.jar:8.1.0]
... 6 more
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=226868ae actual=fcd3484d (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mq_cluster/data/elasticsearch/_state/_9c.fdt"))
at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:440) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.codecs.lucene90.Lucene90CompoundFormat.writeCompoundFile(Lucene90CompoundFormat.java:123) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.codecs.lucene90.Lucene90CompoundFormat.write(Lucene90CompoundFormat.java:98) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.IndexWriter.createCompoundFile(IndexWriter.java:5563) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.DocumentsWriterPerThread.sealFlushedSegment(DocumentsWriterPerThread.java:537) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:468) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:497) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:676) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:4014) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3988) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3967) ~[lucene-core-9.0.0.jar:9.0.0 0b18b3b965cedaf5eb129aa41243a44c83ca826d - jpountz - 2021-12-01 14:23:49]
at org.elasticsearch.gateway.PersistedClusterStateService$MetadataIndexWriter.flush(PersistedClusterStateService.java:692) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.addMetadata(PersistedClusterStateService.java:991) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.overwriteMetadata(PersistedClusterStateService.java:975) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.writeFullStateAndCommit(PersistedClusterStateService.java:788) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.gateway.GatewayMetaState$LucenePersistedState.<init>(GatewayMetaState.java:450) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.gateway.GatewayMetaState.start(GatewayMetaState.java:131) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.node.Node.start(Node.java:1142) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:272) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:367) ~[elasticsearch-8.1.0.jar:8.1.0]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:166) ~[elasticsearch-8.1.0.jar:8.1.0]
... 6 more
[2022-04-05T05:06:08,309][INFO ][o.e.n.Node ] [gxdev1] stopping ...
[2022-04-05T05:06:08,353][INFO ][o.e.n.Node ] [gxdev1] stopped
[2022-04-05T05:06:08,354][INFO ][o.e.n.Node ] [gxdev1] closing ...
[2022-04-05T05:06:08,369][INFO ][o.e.n.Node ] [gxdev1] closed
[2022-04-05T05:06:08,371][INFO ][o.e.x.m.p.NativeController] [gxdev1] Native controller process has stopped - no new native processes can be started
While it's true that they should definitely move away from 7.1.2, and Arch isn't supported, the OP indicates that this is a problem in 8.1.0 too. The latest zlib does include some changes in how CRCs are calculated, which could be having an impact here, although I haven't been able to reproduce the failure myself.
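One quick way to check whether the installed zlib is producing bad CRCs, assuming your Python build links against the system zlib (it usually does on Linux): compute the CRC-32 of the standard test string and compare it with the well-known check value.

```python
# Sanity-check the CRC-32 implementation that Python's zlib module is
# linked against (on most Linux distros this is the system zlib).
# 0xCBF43926 is the standard CRC-32 check value for the input "123456789".
import zlib

checksum = zlib.crc32(b"123456789")
print(hex(checksum))

# A healthy zlib yields 0xcbf43926; anything else means the CRC code
# path is miscomputing, which would explain the Lucene checksum failures.
assert checksum == 0xCBF43926, "zlib is producing wrong CRC-32 values"
```

If the assertion fails on the affected host but passes after downgrading to 1.2.11 (or after changing the VM's CPU type), that would confirm the CRC path is the culprit.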
This was resolved with help on the GitHub bug report: DaveCTurner pointed out that it is a CPU-specific issue. In Proxmox, change the VM's CPU type (from kvm64) to Haswell.
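For anyone hitting the same thing, the CPU type can also be changed from the Proxmox host's CLI with `qm` (a sketch; the VM ID 100 is a placeholder for your own VM, and the guest needs a power cycle for the new CPU flags to take effect):

```shell
# Switch the emulated CPU for VM 100 from kvm64 to Haswell, which
# exposes the newer instruction-set flags the kvm64 model hides.
qm set 100 --cpu Haswell

# A full stop/start (not a reboot from inside the guest) is needed
# for the new CPU model to be picked up.
qm stop 100
qm start 100
```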