Failed to start Elasticsearch service after upgrade from version 8.2 to 8.8

Hello community. After upgrading a cluster with 3 master nodes and 3 data (ingest) nodes, the master nodes work normally, but the Elasticsearch service does not start on the ingest nodes. With DEBUG logging enabled, it returns this message:

[2023-06-02T09:01:53,567][INFO ][o.e.x.w.WatcherService   ] [p-elk-dashdb-03.environment.int] stopping watch service, reason [shutdown initiated]
[2023-06-02T09:01:53,569][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [p-elk-dashdb-03.environment.int] [controller/9946] [Main.cc@176] ML controller exiting
[2023-06-02T09:01:53,569][INFO ][o.e.x.w.WatcherLifeCycleService] [p-elk-dashdb-03.environment.int] watcher has stopped and shutdown
[2023-06-02T09:01:53,571][INFO ][o.e.x.m.p.NativeController] [p-elk-dashdb-03.environment.int] Native controller process has stopped - no new native processes can be started
[2023-06-02T09:01:53,694][INFO ][o.e.c.c.Coordinator      ] [p-elk-dashdb-03.environment.int] master node [{p-elk-master-01.environment.int}{_gikQevEQh6uKcAp1IXcVQ}{t7YzDST1QgGQlbmqTm2O5Q}{p-elk-master-01.environment.int}{172.21.37.34}{172.50.9.12:9300}{mr}{8.8.0}] disconnected, restarting discovery
[2023-06-02T09:01:53,695][DEBUG][o.e.d.PeerFinder         ] [p-elk-dashdb-03.environment.int] address [172.50.9.12:9300], node [null], requesting [false] discovery result
org.elasticsearch.transport.ConnectTransportException: [][172.50.9.12:9300] connection manager is closed
        at org.elasticsearch.transport.ClusterConnectionManager.openConnection(ClusterConnectionManager.java:90) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.transport.TransportService.openConnection(TransportService.java:509) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.HandshakingTransportAddressConnector.connectToRemoteMasterNode(HandshakingTransportAddressConnector.java:78) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder$Peer.establishConnection(PeerFinder.java:379) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.startProbe(PeerFinder.java:324) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator$CoordinatorPeerFinder.startProbe(Coordinator.java:1650) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.handleWakeUp(PeerFinder.java:278) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.activate(PeerFinder.java:126) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator.becomeCandidate(Coordinator.java:856) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator.onLeaderFailure(Coordinator.java:343) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.LeaderChecker$CheckScheduler$2.doRun(LeaderChecker.java:341) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.0.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.lang.Thread.run(Thread.java:1623) ~[?:?]
[2023-06-02T09:01:53,698][DEBUG][o.e.d.PeerFinder         ] [p-elk-dashdb-03.environment.int] address [172.21.37.35:9300], node [null], requesting [false] discovery result
org.elasticsearch.transport.ConnectTransportException: [][172.21.37.35:9300] connection manager is closed
        at org.elasticsearch.transport.ClusterConnectionManager.openConnection(ClusterConnectionManager.java:90) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.transport.TransportService.openConnection(TransportService.java:509) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.HandshakingTransportAddressConnector.connectToRemoteMasterNode(HandshakingTransportAddressConnector.java:78) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder$Peer.establishConnection(PeerFinder.java:379) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.startProbe(PeerFinder.java:324) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator$CoordinatorPeerFinder.startProbe(Coordinator.java:1650) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.handleWakeUp(PeerFinder.java:278) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.activate(PeerFinder.java:126) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator.becomeCandidate(Coordinator.java:856) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator.onLeaderFailure(Coordinator.java:343) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.LeaderChecker$CheckScheduler$2.doRun(LeaderChecker.java:341) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.0.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.lang.Thread.run(Thread.java:1623) ~[?:?]
[2023-06-02T09:01:53,699][DEBUG][o.e.d.PeerFinder         ] [p-elk-dashdb-03.environment.int] address [172.21.37.36:9300], node [null], requesting [false] discovery result
org.elasticsearch.transport.ConnectTransportException: [][172.21.37.36:9300] connection manager is closed
        at org.elasticsearch.transport.ClusterConnectionManager.openConnection(ClusterConnectionManager.java:90) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.transport.TransportService.openConnection(TransportService.java:509) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.HandshakingTransportAddressConnector.connectToRemoteMasterNode(HandshakingTransportAddressConnector.java:78) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder$Peer.establishConnection(PeerFinder.java:379) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.startProbe(PeerFinder.java:324) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator$CoordinatorPeerFinder.startProbe(Coordinator.java:1650) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.handleWakeUp(PeerFinder.java:278) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.discovery.PeerFinder.activate(PeerFinder.java:126) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator.becomeCandidate(Coordinator.java:856) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.Coordinator.onLeaderFailure(Coordinator.java:343) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.cluster.coordination.LeaderChecker$CheckScheduler$2.doRun(LeaderChecker.java:341) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.8.0.jar:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.0.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.lang.Thread.run(Thread.java:1623) ~[?:?]
[2023-06-02T09:01:53,700][DEBUG][o.e.d.SeedHostsResolver  ] [p-elk-dashdb-03.environment.int] resolveConfiguredHosts: lifecycle is STOPPED, not proceeding
[2023-06-02T09:01:54,082][INFO ][o.e.n.Node               ] [p-elk-dashdb-03.environment.int] stopped
[2023-06-02T09:01:54,083][INFO ][o.e.n.Node               ] [p-elk-dashdb-03.environment.int] closing ...
[2023-06-02T09:01:54,118][INFO ][o.e.n.Node               ] [p-elk-dashdb-03.environment.int] closed

P.S. Before the upgrade it was working fine. Can anyone help me?

From that log it looks like something has shut down Elasticsearch?

Is there more to the log you can share?

Hello @warkolm, thank you for the answer.

No, the log stops here and I have no further information; there is nothing in the Linux log either.
If I change the timeout setting in the elasticsearch service to a large value, the service just stays in the starting state and nothing happens until the timeout expires and the service stops. I really don't know what to do, because before the upgrade the service was working normally.

P.S. This is my test cluster.

What about before though?

@warkolm I've found this error, but I don't know what to do about it:

[2023-06-06T11:43:46,166][ERROR][o.e.x.d.l.DeprecationIndexingComponent] [p-elk-dashdb-01.environment.int] Bulk write of deprecation logs encountered some failures: [[iGJfkYgBQ_l3o_VP-WZA org.elasticsearch.action.UnavailableShardsException: [.ds-.logs-deprecation.elasticsearch-default-2023.06.06-000026][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[.ds-.logs-deprecation.elasticsearch-default-2023.06.06-000026][0]] containing [index {[.logs-deprecation.elasticsearch-default][iGJfkYgBQ_l3o_VP-WZA], source[{"@timestamp":"2023-06-06T15:42:41.917Z", "log.level": "WARN",  "data_stream.dataset":"deprecation.elasticsearch","data_stream.namespace":"default","data_stream.type":"logs","elasticsearch.event.category":"settings","event.code":"xpack.monitoring.collection.enabled","message":"[xpack.monitoring.collection.enabled] setting was deprecated in Elasticsearch and will be removed in a future release." , "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"deprecation.elasticsearch","process.thread.name":"main","log.logger":"org.elasticsearch.deprecation.common.settings.Settings","elasticsearch.node.name":"p-elk-dashdb-01.environment.int","elasticsearch.cluster.name":"p-elk"}
]}]]]]

Hello, I found the problem.
The Elasticsearch service file /usr/lib/systemd/system/elasticsearch.service was missing this setting: NotifyAccess=all
After applying it, the service works without errors.
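
For anyone who lands here with the same symptom, here is a minimal sketch of applying that setting through a systemd drop-in rather than editing the packaged unit file under /usr/lib, which a later package upgrade may overwrite. The drop-in path and file name are assumptions based on the usual systemd convention, not something stated in this thread. Only the NotifyAccess=all line comes from the fix above.

```ini
# Assumed location (hypothetical file name):
#   /etc/systemd/system/elasticsearch.service.d/override.conf
#
# With Type=notify, systemd waits for a readiness notification from the
# service. By default it accepts that notification only from the main
# process; NotifyAccess=all accepts it from any process in the service's
# control group.
[Service]
NotifyAccess=all

# After saving the file, reload systemd and restart the service:
#   systemctl daemon-reload
#   systemctl restart elasticsearch
#
# To check which values systemd is actually using:
#   systemctl show elasticsearch -p Type -p NotifyAccess -p TimeoutStartUSec
```

This would fit the behaviour described earlier in the thread, where the service sat in the starting state until the timeout expired and was then stopped. If the readiness notification is being ignored, systemd never considers the service started. That is a likely explanation, not something the logs above confirm.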
