We have a recurring error in our test environment where Elasticsearch fails to boot up. We have logs from two occurrences, but sadly we can't reproduce the error consistently, which makes it hard to determine what the problem might be. It is a single-node setup.
I will post the logs of the errors in separate posts under this topic because of the character limit.
The two errors are not the same, but their causes look very similar (incoming term X does not match current term Y). I looked online, to no avail (I only found one relevant post, which didn't help). I also looked at the code to figure out what can cause such an error, and it looked to me as if a mismatch between the incoming and current terms would indicate some kind of desynchronization between nodes; but this is a single-node setup, so maybe I misunderstood something.
What could be the problem, and how can we mitigate it? Please ask if any other information (besides the logs below) is needed to tell what's going on here. Thank you.
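To make the "incoming term X does not match current term Y" message concrete, here is a minimal sketch (in Python, not Elasticsearch's actual Java code) of the kind of term check that produces it. The class and method names are hypothetical; the idea is that each election attempt bumps the node's current term, and a join request carrying a vote for an older term is rejected once the node has moved on, which can happen even on a single node if several election rounds race each other during bootstrap:

```python
class CoordinationStateRejectedException(Exception):
    """Stand-in for the Elasticsearch exception of the same name."""


class CoordinationSketch:
    """Hypothetical, simplified model of per-node term bookkeeping."""

    def __init__(self):
        self.current_term = 0

    def start_election(self):
        # Every new election attempt increments the term. On a single node,
        # repeated attempts during startup can bump the term several times
        # before any join succeeds.
        self.current_term += 1
        return self.current_term

    def handle_join(self, join_term):
        # A join is only valid for the term it was created in; a stale join
        # from an earlier election round is rejected.
        if join_term != self.current_term:
            raise CoordinationStateRejectedException(
                f"incoming term {join_term} does not match "
                f"current term {self.current_term}"
            )
        return True
```

Under this (simplified) model, a node whose term has already advanced to 5 would reject a join prepared at term 3, which matches the shape of the first log below: the rejections at terms 3 and 4 are stale election attempts, and the node eventually wins at term 6.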
[2020-10-21T00:58:24,874][INFO ][o.e.t.TransportService ] [scb1] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2020-10-21T00:58:26,873][INFO ][o.e.c.c.Coordinator ] [scb1] cluster UUID [BJHT2fLHQCynGCURVTaEIQ]
[2020-10-21T00:58:29,264][INFO ][o.e.c.c.JoinHelper ] [scb1] failed to join {scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr} with JoinRequest{sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, minimumTerm=2, optionalJoin=Optional[Join{term=3, lastAcceptedTerm=2, lastAcceptedVersion=17, sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, targetNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}}]}
org.elasticsearch.transport.RemoteTransportException: [scb1][127.0.0.1:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 3 does not match current term 5
at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:225) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:982) ~[elasticsearch-7.7.0.jar:7.7.0]
at java.util.Optional.ifPresent(Optional.java:159) ~[?:1.8.0_265]
at org.elasticsearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:527) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.Coordinator.lambda$handleJoinRequest$7(Coordinator.java:491) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:372) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:359) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoinRequest(Coordinator.java:478) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$0(JoinHelper.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:767) [elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.7.0.jar:7.7.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
[2020-10-21T00:58:29,264][INFO ][o.e.c.c.JoinHelper ] [scb1] failed to join {scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr} with JoinRequest{sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, minimumTerm=3, optionalJoin=Optional[Join{term=4, lastAcceptedTerm=2, lastAcceptedVersion=17, sourceNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}, targetNode={scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}}]}
org.elasticsearch.transport.RemoteTransportException: [scb1][127.0.0.1:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 4 does not match current term 5
at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:225) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:982) ~[elasticsearch-7.7.0.jar:7.7.0]
at java.util.Optional.ifPresent(Optional.java:159) ~[?:1.8.0_265]
at org.elasticsearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:527) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.Coordinator.lambda$handleJoinRequest$7(Coordinator.java:491) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:372) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:359) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoinRequest(Coordinator.java:478) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$0(JoinHelper.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:767) [elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.7.0.jar:7.7.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
[2020-10-21T00:58:29,810][INFO ][o.e.c.s.MasterService ] [scb1] elected-as-master ([1] nodes joined)[{scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 6, version: 18, delta: master node changed {previous [], current [{scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}]}
[2020-10-21T00:58:30,907][INFO ][o.e.c.s.ClusterApplierService] [scb1] master node changed {previous [], current [{scb1}{JU_QmoZER9yA8MDKIxklEA}{ORaU5I2tTJWM3KlyEItHwg}{127.0.0.1}{127.0.0.1:9300}{dimr}]}, term: 6, version: 18, reason: Publication{term=6, version=18}
[2020-11-09T02:06:31,944][INFO ][o.e.n.Node ] [scb2] stopping ...
[2020-11-09T02:06:32,188][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [scb2] failed to execute on node [ZYiGohE7QBikNcnuPV-TWA]
org.elasticsearch.transport.SendRequestTransportException: [scb2][127.0.0.1:9300][cluster:monitor/nodes/stats[n]]
at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:719) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:621) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:603) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.start(TransportNodesAction.java:182) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.nodes.TransportNodesAction.doExecute(TransportNodesAction.java:82) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.nodes.TransportNodesAction.doExecute(TransportNodesAction.java:51) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:153) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:64) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:399) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.execute(AbstractClient.java:688) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.nodesStats(AbstractClient.java:784) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.updateNodeStats(InternalClusterInfoService.java:249) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.refresh(InternalClusterInfoService.java:291) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.maybeRefresh(InternalClusterInfoService.java:271) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.access$200(InternalClusterInfoService.java:68) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService$SubmitReschedulingClusterInfoUpdatedJob.lambda$run$0(InternalClusterInfoService.java:221) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633) ~[elasticsearch-7.7.0.jar:7.7.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_272]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_272]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
Caused by: org.elasticsearch.node.NodeClosedException: node closed {scb2}{ZYiGohE7QBikNcnuPV-TWA}{netSmENiRS-UYLYgxU4hEQ}{127.0.0.1}{127.0.0.1:9300}{dimr}
at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:701) ~[elasticsearch-7.7.0.jar:7.7.0]
... 22 more
[2020-11-09T02:06:32,191][DEBUG][o.e.a.a.i.s.TransportIndicesStatsAction] [scb2] failed to execute [indices:monitor/stats] on node [ZYiGohE7QBikNcnuPV-TWA]
org.elasticsearch.transport.SendRequestTransportException: [scb2][127.0.0.1:9300][indices:monitor/stats[n]]
at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:719) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:621) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:588) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.sendNodeRequest(TransportBroadcastByNodeAction.java:313) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.start(TransportBroadcastByNodeAction.java:302) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:225) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:76) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:153) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:129) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:64) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:399) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.execute(AbstractClient.java:1234) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.stats(AbstractClient.java:1550) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.updateIndicesStats(InternalClusterInfoService.java:264) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.refresh(InternalClusterInfoService.java:321) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.maybeRefresh(InternalClusterInfoService.java:271) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService.access$200(InternalClusterInfoService.java:68) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.cluster.InternalClusterInfoService$SubmitReschedulingClusterInfoUpdatedJob.lambda$run$0(InternalClusterInfoService.java:221) ~[elasticsearch-7.7.0.jar:7.7.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633) ~[elasticsearch-7.7.0.jar:7.7.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_272]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_272]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
Caused by: org.elasticsearch.node.NodeClosedException: node closed {scb2}{ZYiGohE7QBikNcnuPV-TWA}{netSmENiRS-UYLYgxU4hEQ}{127.0.0.1}{127.0.0.1:9300}{dimr}
at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:701) ~[elasticsearch-7.7.0.jar:7.7.0]
... 23 more
[2020-11-09T02:06:32,236][INFO ][o.e.n.Node ] [scb2] stopped
[2020-11-09T02:06:32,236][INFO ][o.e.n.Node ] [scb2] closing ...
[2020-11-09T02:06:32,311][INFO ][o.e.n.Node ] [scb2] closed
They aren't unhandled either; they're logged (at INFO and DEBUG respectively) as part of the handling process. An unhandled exception would immediately terminate the whole node.
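The distinction can be illustrated with a toy sketch (again Python, not Elasticsearch code): a "handled" exception is caught and logged by the surrounding code and the process carries on, while an unhandled one would propagate out and kill the node. The function names here are made up for illustration:

```python
import logging


def send_stats_request(node_closed: bool) -> str:
    # Hypothetical stand-in for a transport-layer call; fails once the
    # node has begun shutting down.
    if node_closed:
        raise RuntimeError("node closed")
    return "ok"


def refresh_cluster_info(node_closed: bool) -> bool:
    # Mirrors the pattern in the shutdown log above: the failure is
    # expected while the node is "stopping ...", so it is caught, logged
    # at DEBUG, and swallowed rather than allowed to escape the thread.
    try:
        send_stats_request(node_closed)
        return True
    except RuntimeError as exc:
        logging.debug("failed to execute on node: %s", exc)
        return False
```

So the SendRequestTransportException / NodeClosedException entries in the second log are benign side effects of shutdown, not the cause of it.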