org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?)

We are seeing both CorruptIndexException and UnavailableShardsException in the Elasticsearch log.

In our system, every index has 5 shards. During QA testing we ran into index-related exceptions; the details from the Elasticsearch log are below.

CorruptIndexException

{"type": "server", "timestamp": "2024-02-19T06:22:38,273Z", "level": "WARN", "component": "o.e.i.e.Engine", "cluster.name": "docker-cluster", "node.name": "be68e90566ac", "message": " [response][3] failed engine [already closed by tragic event on the index writer]", "cluster.uuid": "TR2USzpBSpKEAxHTF_4pmQ", "node.id": "A4FZJpLpRoyLJN_JIx51Dg" ,
"stacktrace": ["org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=ef33e57f actual=12a3e5e6 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/usr/share/elasticsearch/data/nodes/0/indices/m3PsnRxeR7WAykR6qJhGxg/3/index/_tt2_Lucene84_0.tim")))",
"at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:547) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:349) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:371) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.elasticsearch.index.engine.PrunePostingsMergePolicy$2$1.checkIntegrity(PrunePostingsMergePolicy.java:73) ~[elasticsearch-7.17.5.jar:7.17.5]",
"at org.apache.lucene.codecs.perfield.PerFieldMergeState$FilterFieldsProducer.checkIntegrity(PerFieldMergeState.java:271) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]",
"at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94) ~[elasticsearch-7.17.5.jar:7.17.5]",
"at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684) ~[lucene-core-8.11.1.jar:8.11.1 0b002b11819df70783e83ef36b42ed1223c14b50 - janhoy - 2021-12-14 13:46:43]"] }

UnavailableShardsException

{"type": "server", "timestamp": "2024-02-19T06:23:38,920Z", "level": "WARN", "component": "r.suppressed", "cluster.name": "docker-cluster", "node.name": "be68e90566ac", "message": "path: /response/_doc/response_2024-02-19T15:22:38.806+09:00ce86dbfe-b75e-4490-85de-bbf509615eab, params: {index=response, id=response_2024-02-19T15:22:38.806+09:00ce86dbfe-b75e-4490-85de-bbf509615eab, timeout=1m}", "cluster.uuid": "TR2USzpBSpKEAxHTF_4pmQ", "node.id": "A4FZJpLpRoyLJN_JIx51Dg" ,
"stacktrace": ["org.elasticsearch.action.UnavailableShardsException: [response][3] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[response][3]] containing [index {[response][_doc][response_2024-02-19T15:22:38.806+09:00ce86dbfe-b75e-4490-85de-bbf509615eab], source[{"_class":"com.dfi.infinity..rest.response.elastic.ResponseElasticEntity","id":"response_2024-02-19T15:22:38.806+09:00ce86dbfe-b75e-4490-85de-bbf509615eab","correlationIdString":"API-REF:RA433/190224/1100/200224/0100-RTE:KTM/NRT-CAR:ABC01","timeStamp":"2024-02-19T15:22:38.275Z","surName":"MEKAYLA","givenName":"WINIFRED","otherName":"KIARRAH","journeyReference":"RA433/190224/1100","hitachiSendDateTime":"2024-02-19T15:22:38.805Z","numOfPax":1,"responseStatus":"","aboReference":"19022024152120RTRU","avfReference":"8IO8AKEYUB","requestJsonMsgContentId":"JSON_2024-02-19T15:22:38.764+09:00fc9e9add-03cf-4fcf-9b05-e0fd8df0fc44","messageId":"msg_2024-02-19T15:22:38.379+09:00e3df98fd-e3a6-4fb5-a4d7-9bf831df8808","isCusresSend":false,"createdDateTime":"2024-02-19T15:22:38.805Z","transportId":"mqtr_2023-11-10T20:14:59.761+09:0088710e07-7802-4f25-9a1e-63a246c59e56","operatingAirline":"RA","apiCloseOutTransmission":false,"dateOfBirth":"1988-03-07","nationality":"EST","docType":"P","docTypeInFullValue":"Passport","travelDocNumber":"50365745","travelDocExpiry":"2027-06-02","gender":"F","genderInFullValue":"Female","passengerType":"FL","passengerTypeInFullValue":"Passenger","apiCancelledReservationTransmission":false,"boardingStatus":"PENDING","messageType":"","isParent":false,"msgReceivedHitachiSendDiff":530,"borderCode":"NRT","borderName":"Narita International Airport","errorCount":0,"text":"API-REF:RA433/190224/1100/200224/0100-RTE:KTM/NRT-CAR:ABC01 19/02/2024 15:22 WINIFRED MEKAYLA KIARRAH RA433/190224/1100 19/02/2024 15:22:38 1 RA 07/03/1988 EST P Passport 50365745 02/06/2027 F Female FL Passenger PENDING false 530 NRT Narita International Airport"}]}]]",
"at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:1077) [elasticsearch-7.17.5.jar:7.17.5]",
"at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:873) [elasticsearch-7.17.5.jar:7.17.5]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.17.5.jar:7.17.5]",
"at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:1032) [elasticsearch-7.17.5.jar:7.17.5]",
"at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:345) [elasticsearch-7.17.5.jar:7.17.5]",
"at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:263) [elasticsearch-7.17.5.jar:7.17.5]",
"at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:660) [elasticsearch-7.17.5.jar:7.17.5]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:718) [elasticsearch-7.17.5.jar:7.17.5]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]",
"at java.lang.Thread.run(Thread.java:833) [?:?]"] }

On the Java side, we got the exception below:
[org.springframework.dao.DataAccessResourceFailureException: 10,000 milliseconds timeout on connection http-outgoing-6 [ACTIVE]; nested exception is java.lang.RuntimeException: 10,000 milliseconds timeout on connection http-outgoing-6 [ACTIVE]
at org.springframework.data.elasticsearch.core.ElasticsearchExceptionTranslator.translateExceptionIfPossible(ElasticsearchExceptionTranslator.java:94)
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.translateException(ElasticsearchRestTemplate.java:601)
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.execute(ElasticsearchRestTemplate.java:584)
at org.springframework.data.elasticsearch.core.RestIndexTemplate.doRefresh(RestIndexTemplate.java:178)
at org.springframework.data.elasticsearch.core.AbstractIndexTemplate.refresh(AbstractIndexTemplate.java:163)
at org.springframework.data.elasticsearch.repository.support.SimpleElasticsearchRepository.doRefresh(SimpleElasticsearchRepository.java:313)
at org.springframework.data.elasticsearch.repository.support.SimpleElasticsearchRepository.executeAndRefresh(SimpleElasticsearchRepository.java:361)
at org.springframework.data.elasticsearch.repository.support.SimpleElasticsearchRepository.save(SimpleElasticsearchRepository.java:188)
at jdk.internal.reflect.GeneratedMethodAccessor539.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.data.repository.core.support.RepositoryMethodInvoker$RepositoryFragmentMethodInvoker.lambda$new$0(RepositoryMethodInvoker.java:289)
at org.springframework.data.repository.core.support.RepositoryMethodInvoker.doInvoke(RepositoryMethodInvoker.java:137)
at org.springframework.data.repository.core.support.RepositoryMethodInvoker.invoke(RepositoryMethodInvoker.java:121)
at org.springframework.data.repository.core.support.RepositoryComposition$RepositoryFragments.invoke(RepositoryComposition.java:530)
at org.springframework.data.repository.core.support.RepositoryComposition.invoke(RepositoryComposition.java:286)
at org.springframework.data.repository.core.support.RepositoryFactorySupport$ImplementationMethodExecutionInterceptor.invoke(RepositoryFactorySupport.java:640)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.doInvoke(QueryExecutorMethodInterceptor.java:164)
at org.springframework.data.repository.core.support.QueryExecutorMethodInterceptor.invoke(QueryExecutorMethodInterceptor.java:139)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
at com.sun.proxy.$Proxy250.save(Unknown Source)
at com.dfi.infinity.iapi.iapirest.ResponseManager.saveIapiResponseEntity(ResponseManager.java:446)
at com.dfi.infinity.iapi.iapirest.IAPIRestActor.sendRequest(IAPIRestActor.java:155)
at com.dfi.infinity.iapi.iapirest.IAPIRestActor.onReceive(IAPIRestActor.java:103)
at akka.actor.UntypedAbstractActor$$anon$1.applyOrElse(AbstractActor.scala:332)
at akka.actor.Actor.aroundReceive(Actor.scala:537)
at akka.actor.Actor.aroundReceive$(Actor.scala:471)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
at akka.actor.ActorCell.invoke(ActorCell.scala:548)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
at akka.dispatch.Mailbox.run(Mailbox.scala:231)
at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: java.lang.RuntimeException: 10,000 milliseconds timeout on connection http-outgoing-6 [ACTIVE]
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.translateException(ElasticsearchRestTemplate.java:599)
... 40 more
Caused by: java.net.SocketTimeoutException: 10,000 milliseconds timeout on connection http-outgoing-6 [ACTIVE]
at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:917)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:300)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:288)
at org.elasticsearch.client.RestHighLevelClient.performClientRequest(RestHighLevelClient.java:2699)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:2171)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:2137)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:2105)
at org.elasticsearch.client.IndicesClient.refresh(IndicesClient.java:900)
at org.springframework.data.elasticsearch.core.RestIndexTemplate.lambda$doRefresh$8(RestIndexTemplate.java:178)
at org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate.execute(ElasticsearchRestTemplate.java:582)
... 39 more
Caused by: java.net.SocketTimeoutException: 10,000 milliseconds timeout on connection http-outgoing-6 [ACTIVE]
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387)
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:98)
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:40)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175)
at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:261)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:506)
at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:211)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
at java.base/java.lang.Thread.run(Thread.java:829)

We are currently using Elasticsearch version 7.17.5 and spring-data-elasticsearch version 4.4.2.

We have mounted the Elasticsearch container's data directory to a local path on the VM:
/opt/elasticsearch-data:/usr/share/elasticsearch/data

Elasticsearch is running as a Docker container.
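
For reference, the mount described above would look roughly like this in a Docker Compose file. This is only a sketch: whether Compose or plain `docker run` is used is not stated here, so the service name `elasticsearch` and the image reference are assumptions. Only the volume mapping and the 7.17.5 version come from our setup.

```yaml
# Sketch only: service name and image reference are assumed, not taken from the real deployment.
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.5
    volumes:
      # VM local path : data path inside the container (from our setup)
      - /opt/elasticsearch-data:/usr/share/elasticsearch/data
```

The corrupted file reported in the log (`/usr/share/elasticsearch/data/nodes/0/indices/m3PsnRxeR7WAykR6qJhGxg/3/index/_tt2_Lucene84_0.tim`) would therefore sit under `/opt/elasticsearch-data/nodes/0/indices/...` on the VM.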

Has anyone else encountered a similar issue? How can we resolve it?
