Elasticsearch single node "cluster" fails with "incoming term X does not match current term Y"

Hi there,

We have a recurring error in our test environment where Elasticsearch fails to boot up. We have logs from two occurrences, but sadly we can't reproduce the error consistently, which makes it hard to determine what the problem might be. It is a single-node setup.

I will post the logs of the errors in separate posts under this topic because of the character limit.

The two errors are not the same, but their causes look very similar (incoming term X does not match current term Y). I looked online, to no avail (I found only one relevant post, which didn't help). I also looked at the code, trying to figure out what can cause such an error, and it seemed to me that a mismatch between these incoming and current terms would indicate a kind of desynchronization between nodes, but this is a single-node setup, so maybe I misunderstood something. My (possibly wrong) reading of the check is sketched below.
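
From my reading, the check that throws this boils down to a simple term comparison, roughly like the following. This is only a simplified sketch of my own understanding, not the actual Elasticsearch source; the term values are just taken from the first log below.

// Simplified, self-contained paraphrase of the term check as I understand it,
// not the real CoordinationState code.
public class TermCheckSketch {

    static class CoordinationStateRejectedException extends RuntimeException {
        CoordinationStateRejectedException(String message) { super(message); }
    }

    // Term the node has already moved on to (5 in the first log).
    private long currentTerm = 5;

    // A join is created for a specific term; if the node has since started a
    // newer election round, the now-stale join is rejected with this message.
    void handleJoin(long incomingTerm) {
        if (incomingTerm != currentTerm) {
            throw new CoordinationStateRejectedException(
                "incoming term " + incomingTerm + " does not match current term " + currentTerm);
        }
        // ... otherwise the join would be accepted
    }

    public static void main(String[] args) {
        TermCheckSketch state = new TermCheckSketch();
        try {
            state.handleJoin(3); // the first rejected join in the log carried term 3
        } catch (CoordinationStateRejectedException e) {
            System.out.println(e.getMessage());
        }
    }
}

If that reading is right, the rejected joins were simply created for an older term than the one the node had already reached by the time they arrived.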

What could be the problem, and how can we mitigate it? Please ask if any other information (besides the logs below) is needed to tell what's going on here. Thank you.

[2020-10-21T00:58:24,874][INFO ][o.e.t.TransportService   ] [scb1] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2020-10-21T00:58:26,873][INFO ][o.e.c.c.Coordinator      ] [scb1] cluster UUID [BJHT2fLHQCynGCURVTaEIQ]
[2020-10-21T00:58:29,264][INFO ][o.e.c.c.JoinHelper       ] [scb1] failed to join {scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr} with JoinRequest{sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, minimumTerm=2, optionalJoin=Optional[Join{term=3, lastAcceptedTerm=2, lastAcceptedVersion=17, sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, targetNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}}]}
org.elasticsearch.transport.RemoteTransportException: [scb1][127.0.0.1:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 3 does not match current term 5
	at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:225) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:982) ~[elasticsearch-7.7.0.jar:7.7.0]
	at java.util.Optional.ifPresent(Optional.java:159) ~[?:1.8.0_265]
	at org.elasticsearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:527) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.Coordinator.lambda$handleJoinRequest$7(Coordinator.java:491) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:372) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:359) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.Coordinator.handleJoinRequest(Coordinator.java:478) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$0(JoinHelper.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:767) [elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.7.0.jar:7.7.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
[2020-10-21T00:58:29,264][INFO ][o.e.c.c.JoinHelper       ] [scb1] failed to join {scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr} with JoinRequest{sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, minimumTerm=3, optionalJoin=Optional[Join{term=4, lastAcceptedTerm=2, lastAcceptedVersion=17, sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, targetNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}}]}
org.elasticsearch.transport.RemoteTransportException: [scb1][127.0.0.1:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 4 does not match current term 5
	at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:225) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:982) ~[elasticsearch-7.7.0.jar:7.7.0]
	at java.util.Optional.ifPresent(Optional.java:159) ~[?:1.8.0_265]
	at org.elasticsearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:527) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.Coordinator.lambda$handleJoinRequest$7(Coordinator.java:491) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:372) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:359) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.Coordinator.handleJoinRequest(Coordinator.java:478) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$0(JoinHelper.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:767) [elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.7.0.jar:7.7.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
[2020-10-21T00:58:29,810][INFO ][o.e.c.s.MasterService    ] [scb1] elected-as-master ([1] nodes joined)[{scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 6, version: 18, delta: master node changed {previous [], current [{scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}]}
[2020-10-21T00:58:30,907][INFO ][o.e.c.s.ClusterApplierService] [scb1] master node changed {previous [], current [{scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}]}, term: 6, version: 18, reason: Publication{term=6, version=18}
[2020-11-09T02:06:31,944][INFO ][o.e.n.Node               ] [scb2] stopping ...
[2020-11-09T02:06:32,188][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [scb2] failed to execute on node [ZYiGohE7QBikNcnuPV-TWA]
org.elasticsearch.transport.SendRequestTransportException: [scb2][127.0.0.1:9300][cluster:monitor/nodes/stats[n]]
	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:719) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:621) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:603) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.start(TransportNodesAction.java:182) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.nodes.TransportNodesAction.doExecute(TransportNodesAction.java:82) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.nodes.TransportNodesAction.doExecute(TransportNodesAction.java:51) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:153) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:64) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:399) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.execute(AbstractClient.java:688) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.nodesStats(AbstractClient.java:784) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.updateNodeStats(InternalClusterInfoService.java:249) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.refresh(InternalClusterInfoService.java:291) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.maybeRefresh(InternalClusterInfoService.java:271) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.access$200(InternalClusterInfoService.java:68) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService$SubmitReschedulingClusterInfoUpdatedJob.lambda$run$0(InternalClusterInfoService.java:221) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633) ~[elasticsearch-7.7.0.jar:7.7.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_272]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_272]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
Caused by: org.elasticsearch.node.NodeClosedException: node closed {scb2}{ZYiGohE7QBikNcnuPV-TWA}{netSmENiRS-UYLYgxU4hEQ}{127.0.0.1}{127.0.0.1:9300}{dimr}
	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:701) ~[elasticsearch-7.7.0.jar:7.7.0]
	... 22 more
[2020-11-09T02:06:32,191][DEBUG][o.e.a.a.i.s.TransportIndicesStatsAction] [scb2] failed to execute [indices:monitor/stats] on node [ZYiGohE7QBikNcnuPV-TWA]
org.elasticsearch.transport.SendRequestTransportException: [scb2][127.0.0.1:9300][indices:monitor/stats[n]]
	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:719) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:621) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:588) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.sendNodeRequest(TransportBroadcastByNodeAction.java:313) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.start(TransportBroadcastByNodeAction.java:302) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:225) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:76) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:153) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:64) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:399) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.execute(AbstractClient.java:1234) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.stats(AbstractClient.java:1550) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.updateIndicesStats(InternalClusterInfoService.java:264) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.refresh(InternalClusterInfoService.java:321) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.maybeRefresh(InternalClusterInfoService.java:271) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService.access$200(InternalClusterInfoService.java:68) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.cluster.InternalClusterInfoService$SubmitReschedulingClusterInfoUpdatedJob.lambda$run$0(InternalClusterInfoService.java:221) ~[elasticsearch-7.7.0.jar:7.7.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633) ~[elasticsearch-7.7.0.jar:7.7.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_272]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_272]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
Caused by: org.elasticsearch.node.NodeClosedException: node closed {scb2}{ZYiGohE7QBikNcnuPV-TWA}{netSmENiRS-UYLYgxU4hEQ}{127.0.0.1}{127.0.0.1:9300}{dimr}
	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:701) ~[elasticsearch-7.7.0.jar:7.7.0]
	... 23 more
[2020-11-09T02:06:32,236][INFO ][o.e.n.Node               ] [scb2] stopped
[2020-11-09T02:06:32,236][INFO ][o.e.n.Node               ] [scb2] closing ...
[2020-11-09T02:06:32,311][INFO ][o.e.n.Node               ] [scb2] closed

The INFO messages you shared can happen sometimes, but if you only see a couple at node startup then they don't indicate a problem.

The second message is normal: you've signalled the node to shut down, which means that some ongoing work fails (i.e. is cancelled or rejected).
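
Schematically, the behaviour is something like the sketch below (a rough illustration of the idea only, not the actual implementation): a lifecycle flag flips when the node starts stopping, anything still in flight fails with a "node closed" style exception, and the periodic stats task just logs that failure at DEBUG, which is exactly the message you saw.

import java.util.concurrent.atomic.AtomicBoolean;

// Rough sketch of why a periodic task logs a "node closed" failure during
// shutdown; an illustration only, not the real TransportService code.
public class ShutdownSketch {

    static class NodeClosedException extends RuntimeException {
        NodeClosedException() { super("node closed"); }
    }

    private final AtomicBoolean closed = new AtomicBoolean(false);

    // Requests are checked against the lifecycle flag before being sent.
    void sendNodeStatsRequest() {
        if (closed.get()) {
            throw new NodeClosedException();
        }
        // ... normal request path
    }

    void stop() {
        closed.set(true); // "stopping ..." in your log
    }

    public static void main(String[] args) {
        ShutdownSketch node = new ShutdownSketch();
        node.stop();
        try {
            node.sendNodeStatsRequest(); // in-flight periodic stats refresh
        } catch (NodeClosedException e) {
            // Caught and logged at DEBUG, then ignored, just like in your log.
            System.out.println("failed to execute on node: " + e.getMessage());
        }
    }
}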

Thank you for the quick answer.

Can these unhandled exceptions cause translog corruption, or any other kind of loss of data?

No, definitely not.

They aren't unhandled either; they're logged (at INFO and DEBUG respectively) as part of the handling process. Unhandled exceptions would immediately terminate the whole node.

Thank you for the clarification, I'm going to close the thread now.
