Recreation of ES data node failed

KOTOXJle6 · September 6, 2020, 11:30am

Good afternoon.

I have an Elasticsearch cluster consisting of three data nodes and one coordinating node with Kibana. One of the servers died and was re-created. The ES version is the same, Java is the same, and the configuration in elasticsearch.yml is the same.

The service started without errors, but the node did not appear in cluster monitoring. curl localhost:9200/_cat/nodes/ doesn't show the problem node.

What could the problem be?

ylasri · September 6, 2020, 11:39am

Can you share the log file from the started node?

KOTOXJle6 · September 6, 2020, 11:47am

I can't post the full log from /var/log/elasticsearch/cluster.log, it's too big.

Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
	at org.elasticsearch.cluster.coordination.Coordinator$3.onFailure(Coordinator.java:500) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.coordination.JoinHelper$5.handleException(JoinHelper.java:359) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1124) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.transport.TcpTransport.lambda$handleException$24(TcpTransport.java:1001) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.1.0.jar:7.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
Caused by: org.elasticsearch.transport.RemoteTransportException: [elk-es-node-02][10.199.5.104:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid VdZPYWRCT8eMvjFOpWB6Lw than local cluster uuid 693EJRStTo-3GwqanfvZkA, rejecting
	at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:147) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:251) ~[?:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:309) ~[?:?]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1077) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.1.0.jar:7.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_211]
[2020-09-06T14:51:13,609][INFO ][o.e.c.c.JoinHelper       ] [elk-es-node-02] failed to join {elk-es-node-03}{sPaS62bdS8CJrY9MWNUkgQ}{eToptbY8TFahH6E1iv_RLQ}{10.199.5.105}{10.199.5.105:9300}{ml.machine_memory=12592672768, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={elk-es-node-02}{u2qN94UbTt6hHKl_ShftVw}{wsLKW5uGTUSnM6_YjvmxIQ}{10.199.5.104}{10.199.5.104:9300}{ml.machine_memory=12564250624, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional.empty}
org.elasticsearch.transport.RemoteTransportException: [elk-es-node-03][10.199.5.105:9300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
	at org.elasticsearch.cluster.coordination.Coordinator$3.onFailure(Coordinator.java:500) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.coordination.JoinHelper$5.handleException(JoinHelper.java:359) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1124) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.transport.TcpTransport.lambda$handleException$24(TcpTransport.java:1001) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.1.0.jar:7.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
Caused by: org.elasticsearch.transport.RemoteTransportException: [elk-es-node-02][10.199.5.104:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid VdZPYWRCT8eMvjFOpWB6Lw than local cluster uuid 693EJRStTo-3GwqanfvZkA, rejecting
	at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:147) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:251) ~[?:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:309) ~[?:?]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1077) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.1.0.jar:7.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_211]
[2020-09-06T14:51:14,115][WARN ][r.suppressed             ] [elk-es-node-02] path: /_license, params: {}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:259) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:555) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-7.1.0.jar:7.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
[2020-09-06T14:51:14,356][WARN ][r.suppressed             ] [elk-es-node-02] path: /_license, params: {}
org.elasticsearch.discovery.MasterNotDiscoveredException: null
	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:259) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:555) [elasticsearch-7.1.0.jar:7.1.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-7.1.0.jar:7.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

Christian_Dahlqvist · September 6, 2020, 11:55am

It does not seem like the new node is configured to link up with the other nodes. What does your elasticsearch.yml config file look like?

ylasri · September 6, 2020, 11:56am

It looks like a single node that is not part of a cluster.
Make sure you use the same cluster config as your existing setup.

KOTOXJle6 · September 6, 2020, 12:00pm

Sorry, I posted elasticsearch.log the first time. The post is now updated with cluster.log.

KOTOXJle6 · September 6, 2020, 12:20pm

It looks like the problem is this: "Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid VdZPYWRCT8eMvjFOpWB6Lw than local cluster uuid 693EJRStTo-3GwqanfvZkA, rejecting". But I don't understand how to fix it.
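
For anyone reading a long cluster.log, the two UUIDs in that message can be pulled out with a small script. This is only a sketch, assuming the message wording shown in the log above. The function name parse_uuid_mismatch is made up for illustration. "local" is the UUID held by the node that ran the validation (elk-es-node-02 in this log), and "remote" is the UUID of the cluster state it was asked to accept.

```python
import re

# Pattern for the rejection message as it appears in the log above.
# Assumption: the wording is the one printed by Elasticsearch 7.1.0.
JOIN_REJECT_RE = re.compile(
    r"different cluster uuid (?P<remote>\S+) "
    r"than local cluster uuid (?P<local>[^\s,]+)"
)


def parse_uuid_mismatch(line):
    """Return the remote and local cluster UUIDs from a rejection line.

    Returns None when the line does not contain the message.
    """
    match = JOIN_REJECT_RE.search(line)
    if match is None:
        return None
    return {"remote": match.group("remote"), "local": match.group("local")}
```

If the two values differ, the node already holds the state of another cluster, which matches the rejection in the log.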

Christian_Dahlqvist · September 6, 2020, 12:42pm

I think the new node has been started up without the correct config and created a cluster of its own. Wipe the storage of the new node and launch it with the correct config as that should allow it to join.
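
One way to confirm this before wiping anything is to compare the cluster_uuid that each node reports on its root HTTP endpoint (GET /). The following is a minimal sketch, not a definitive tool. It assumes the nodes answer plain HTTP on port 9200 without authentication, which may not hold here since the stack trace shows X-Pack security classes. The function names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_cluster_uuid(base_url, timeout=5):
    """Read cluster_uuid from a node's root endpoint, e.g. http://host:9200/."""
    with urlopen(base_url, timeout=timeout) as resp:
        return json.load(resp)["cluster_uuid"]


def find_outliers(uuid_by_node):
    """Return the nodes whose cluster UUID differs from the most common one."""
    if not uuid_by_node:
        return []
    counts = Counter(uuid_by_node.values())
    majority_uuid, _ = counts.most_common(1)[0]
    return sorted(
        node for node, uuid in uuid_by_node.items() if uuid != majority_uuid
    )


# Example usage (addresses taken from the log above; adjust to your setup):
#   hosts = ["http://10.199.5.104:9200/", "http://10.199.5.105:9200/"]
#   uuids = {h: fetch_cluster_uuid(h) for h in hosts}
#   print(find_outliers(uuids))
```

With only two nodes that disagree there is no real majority, so it is better to query every node of the existing cluster as well. A node listed as an outlier has formed or joined a different cluster.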

KOTOXJle6 · September 6, 2020, 12:47pm

It worked! I realized the problem was that I had first started ES with the default configuration to check .conf, which created a new cluster, and only then tried to attach the node to the old cluster, which produced the error.

Thanks all!
