Elastic cluster - 3 nodes (1 master - 2 data)

Hello,

I have 3 nodes and have started to create a cluster:

1 master and 2 data nodes

The problem is that I cannot connect the 3rd node (the 2nd data node) to the cluster.

[root@elastic1 ~]# curl localhost:9200/_cat/nodes          
172.27.52.56 22 70 4 0.00 0.09 0.14 di - data1
172.27.52.55 16 45 0 0.00 0.03 0.09 mi * master

The message I get on the 3rd node is:
Caused by: org.elasticsearch.transport.RemoteTransportException: [data2][172.27.52.57:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 7UiHGr-tRNG2PPoJ6Po5Nw than local cluster uuid Tx77PhoRTv6KUsXkON5DGA, rejecting

The configurations of the nodes are below:

**node master**

[root@elastic1 ~]# cat /etc/elasticsearch/elasticsearch.yml      


cluster.name: nms_elastic

cluster.initial_master_nodes: 172.27.52.55
node.master: true
node.data: false


# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: [_local_, _site_]
#
# Set a custom port for HTTP:
#
http.port: 9200




**node data1**

[root@elastic2 ~]# vi /etc/elasticsearch/elasticsearch.yml
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: nms_elastic
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: data1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
node.master: false
node.data: true
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: [_local_, _site_]
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["172.27.52.55"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true


**node data2**


[root@elastic3 ~]# cat /etc/elasticsearch/elasticsearch.yml      
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: nms_elastic
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: data2
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
cluster.initial_master_nodes: 172.27.52.55
node.master: false
node.data: true
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: [_local_, _site_]
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["172.27.52.55"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["172.27.52.55"]
#
#discovery.zen.minimum_master_nodes: 
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true

Thank you in advance for your help

This note in the docs tells you what to do in this situation.

Also please format your post better in future. Its terrible formatting makes it almost unreadable and therefore very unlikely to receive an answer.

Hello,

What I am not able to understand is why the master server is able to see data1 but not data2, since the configuration is the same.

Whatever changes I make, the 3rd node is not able to see the other two (and vice versa).

[root@elastic3 bin]# curl localhost:9200
{
  "name" : "data2",
  "cluster_name" : "nms_elastic",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

The doc that you provided is not very helpful.

Thank you

Please format your posts better to make it easier to help. You can see a preview of your post when you're writing it, to check that it looks right. Use things like the </> button to format things as you are seeing them.

The information you've shared here suggests you're now seeing a different problem, but doesn't contain a lot of extra information. Please share the logs now, properly formatted.

OK, I changed the cluster.name to nms.

Below are only the sections that we use:


[root@elastic1 elasticsearch]# cat elasticsearch.yml                         

cluster.name: nms
cluster.initial_master_nodes: 172.27.52.55
node.master: true
node.data: false



network.host: [_local_, _site_]



#
#Set a custom port for HTTP:
#
http.port: 9200

discovery.seed_hosts: ["127.0.0.1"]


----------------------------------------------------------------------------------------------------

[root@elastic2 elasticsearch]# cat elasticsearch.yml                         

cluster.name: nms
node.name: data1
node.master: false
node.data: true

network.host: [_local_, _site_]
#
#Set a custom port for HTTP:
#
http.port: 9200

discovery.seed_hosts: ["172.27.52.55"]


----------------------------------------------------------------------------------------------------

[root@elastic3 elasticsearch]# cat elasticsearch.yml 

cluster.name: nms
node.name: data2
node.master: false
node.data: true

network.host: [_local_, _site_]
#
#Set a custom port for HTTP:
#
http.port: 9200

discovery.seed_hosts: ["172.27.52.55"]

(I hope the format is OK now.)
Thank you

No, still badly-formatted. This kind of thing is very confusing:

[screenshot: the network.host line as posted, with the underscores swallowed by the forum's italics formatting]

You must use the </> button to format things exactly how you are seeing them:

network.host: [_local_, _site_]

Note the vitally-important underscores!

There's not nearly enough information to help here. We need to see the logs.

Master

[root@elastic1 elasticsearch]# tail -f /var/log/elasticsearch/nms.log

[2019-07-15T12:38:18,586][INFO ][o.e.n.Node               ] [master] starting ...
[2019-07-15T12:38:18,763][INFO ][o.e.t.TransportService   ] [master] publish_address {172.27.52.55:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}, {172.27.52.55:9300}
[2019-07-15T12:38:18,774][INFO ][o.e.b.BootstrapChecks    ] [master] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-07-15T12:38:18,862][INFO ][o.e.c.c.Coordinator      ] [master] cluster UUID [7UiHGr-tRNG2PPoJ6Po5Nw]
[2019-07-15T12:38:18,989][INFO ][o.e.c.s.MasterService    ] [master] elected-as-master ([1] nodes joined)[{master}{JlguofZdQvWhiaVGat8v6w}{U2Gh6B_ZRUWHIljz1TxnFw}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 4, version: 25, reason: master node changed {previous [], current [{master}{JlguofZdQvWhiaVGat8v6w}{U2Gh6B_ZRUWHIljz1TxnFw}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, xpack.installed=true, ml.max_open_jobs=20}]}
[2019-07-15T12:38:19,044][INFO ][o.e.c.s.ClusterApplierService] [master] master node changed {previous [], current [{master}{JlguofZdQvWhiaVGat8v6w}{U2Gh6B_ZRUWHIljz1TxnFw}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, xpack.installed=true, ml.max_open_jobs=20}]}, term: 4, version: 25, reason: Publication{term=4, version=25}
[2019-07-15T12:38:19,123][INFO ][o.e.h.AbstractHttpServerTransport] [master] publish_address {172.27.52.55:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}, {172.27.52.55:9200}
[2019-07-15T12:38:19,124][INFO ][o.e.n.Node               ] [master] started
[2019-07-15T12:38:19,277][WARN ][o.e.c.c.Coordinator      ] [master] failed to validate incoming join request from node [{data2}{0L8h9xYNToexfdx8MLLMAQ}{Mu1RxulUQN60hPe4Tx4oqQ}{172.27.52.57}{172.27.52.57:9300}{ml.machine_memory=16656531456, ml.max_open_jobs=20, xpack.installed=true}]
org.elasticsearch.transport.RemoteTransportException: [data2][172.27.52.57:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 7UiHGr-tRNG2PPoJ6Po5Nw than local cluster uuid Tx77PhoRTv6KUsXkON5DGA, rejecting
        at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:147) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:250) ~[?:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:308) ~[?:?]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:267) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:758) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.2.0.jar:7.2.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:835) [?:?]
[2019-07-15T12:38:19,432][INFO ][o.e.l.LicenseService     ] [master] license [f480cac7-d8d0-4b23-b727-513c307d8a3e] mode [basic] - valid
[2019-07-15T12:38:19,444][INFO ][o.e.g.GatewayService     ] [master] recovered [0] indices into cluster_state
[2019-07-15T12:38:19,460][INFO ][o.e.c.s.MasterService    ] [master] node-join[{data1}{b0WxAo8qQ-e9Oq6Nk1vRLQ}{HeQ4HjypRLWEkgDcTI59kg}{172.27.52.56}{172.27.52.56:9300}{ml.machine_memory=10314784768, ml.max_open_jobs=20, xpack.installed=true} join existing leader], term: 4, version: 28, reason: added {{data1}{b0WxAo8qQ-e9Oq6Nk1vRLQ}{HeQ4HjypRLWEkgDcTI59kg}{172.27.52.56}{172.27.52.56:9300}{ml.machine_memory=10314784768, ml.max_open_jobs=20, xpack.installed=true},}
[2019-07-15T12:38:20,079][WARN ][o.e.c.c.Coordinator      ] [master] failed to validate incoming join request from node [{data2}{0L8h9xYNToexfdx8MLLMAQ}{Mu1RxulUQN60hPe4Tx4oqQ}{172.27.52.57}{172.27.52.57:9300}{ml.machine_memory=16656531456, ml.max_open_jobs=20, xpack.installed=true}]
org.elasticsearch.transport.RemoteTransportException: [data2][172.27.52.57:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 7UiHGr-tRNG2PPoJ6Po5Nw than local cluster uuid Tx77PhoRTv6KUsXkON5DGA, rejecting
        at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:147) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:250) ~[?:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:308) ~[?:?]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:267) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:758) ~[elasticsearch-7.2.0.jar:7.2.0]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.2.0.jar:7.2.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:835) [?:?]

data1

    [root@elastic2 elasticsearch]# tail -f /var/log/elasticsearch/nms.log        
    [2019-07-15T11:21:30,552][INFO ][o.e.n.Node               ] [data1] starting ...
    [2019-07-15T11:21:30,733][INFO ][o.e.t.TransportService   ] [data1] publish_address {172.27.52.56:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}, {172.27.52.56:9300}
    [2019-07-15T11:21:30,743][INFO ][o.e.b.BootstrapChecks    ] [data1] bound or publishing to a non-loopback address, enforcing bootstrap checks
    [2019-07-15T11:21:30,834][INFO ][o.e.c.c.Coordinator      ] [data1] cluster UUID [7UiHGr-tRNG2PPoJ6Po5Nw]
    [2019-07-15T11:21:31,105][INFO ][o.e.c.s.ClusterApplierService] [data1] master node changed {previous [], current [{master}{JlguofZdQvWhiaVGat8v6w}{ACoM2MKJRqSsHspgErAfUQ}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true}]}, added {{master}{JlguofZdQvWhiaVGat8v6w}{ACoM2MKJRqSsHspgErAfUQ}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true},}, term: 3, version: 24, reason: ApplyCommitRequest{term=3, version=24, sourceNode={master}{JlguofZdQvWhiaVGat8v6w}{ACoM2MKJRqSsHspgErAfUQ}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true}}
    [2019-07-15T11:21:31,316][INFO ][o.e.x.s.a.TokenService   ] [data1] refresh keys
    [2019-07-15T11:21:32,092][INFO ][o.e.x.s.a.TokenService   ] [data1] refreshed keys
    [2019-07-15T11:21:32,218][INFO ][o.e.l.LicenseService     ] [data1] license [f480cac7-d8d0-4b23-b727-513c307d8a3e] mode [basic] - valid
    [2019-07-15T11:21:32,249][INFO ][o.e.h.AbstractHttpServerTransport] [data1] publish_address {172.27.52.56:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}, {172.27.52.56:9200}
    [2019-07-15T11:21:32,250][INFO ][o.e.n.Node               ] [data1] started
    [2019-07-15T12:38:16,885][INFO ][o.e.c.s.ClusterApplierService] [data1] master node changed {previous [{master}{JlguofZdQvWhiaVGat8v6w}{ACoM2MKJRqSsHspgErAfUQ}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true}], current []}, term: 3, version: 24, reason: becoming candidate: onLeaderFailure
    [2019-07-15T12:38:16,906][WARN ][o.e.c.NodeConnectionsService] [data1] failed to connect to {master}{JlguofZdQvWhiaVGat8v6w}{ACoM2MKJRqSsHspgErAfUQ}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true} (tried [1] times)
    org.elasticsearch.transport.ConnectTransportException: [master][172.27.52.55:9300] connect_exception
            at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:972) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$3(ActionListener.java:161) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-7.2.0.jar:7.2.0]
            at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
            at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:883) ~[?:?]
            at java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2322) ~[?:?]
            at org.elasticsearch.common.concurrent.CompletableContext.addListener(CompletableContext.java:45) ~[elasticsearch-core-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.netty4.Netty4TcpChannel.addConnectListener(Netty4TcpChannel.java:121) ~[?:?]
            at org.elasticsearch.transport.TcpTransport.initiateConnection(TcpTransport.java:299) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:266) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.ConnectionManager.internalOpenConnection(ConnectionManager.java:206) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.ConnectionManager.connectToNode(ConnectionManager.java:104) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:346) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:333) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.cluster.NodeConnectionsService$ConnectionTarget$1.doRun(NodeConnectionsService.java:304) [elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:758) [elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.2.0.jar:7.2.0]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
            at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: 172.27.52.55/172.27.52.55:9300
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
        at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[?:?]
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:670) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:582) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:536) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        ... 1 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
        at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[?:?]
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:670) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:582) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:536) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        ... 1 more
[2019-07-15T12:38:26,878][WARN ][o.e.c.c.ClusterFormationFailureHelper] [data1] master not discovered yet: have discovered []; discovery will continue using [172.27.52.55:9300] from hosts providers and [{master}{JlguofZdQvWhiaVGat8v6w}{ACoM2MKJRqSsHspgErAfUQ}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true}, {data1}{b0WxAo8qQ-e9Oq6Nk1vRLQ}{HeQ4HjypRLWEkgDcTI59kg}{172.27.52.56}{172.27.52.56:9300}{ml.machine_memory=10314784768, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 3, last-accepted version 24 in term 3
[2019-07-15T12:38:30,981][INFO ][o.e.c.s.ClusterApplierService] [data1] master node changed {previous [], current [{master}{JlguofZdQvWhiaVGat8v6w}{U2Gh6B_ZRUWHIljz1TxnFw}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true}]}, removed {{master}{JlguofZdQvWhiaVGat8v6w}{ACoM2MKJRqSsHspgErAfUQ}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true},}, added {{master}{JlguofZdQvWhiaVGat8v6w}{U2Gh6B_ZRUWHIljz1TxnFw}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true},}, term: 4, version: 28, reason: ApplyCommitRequest{term=4, version=28, sourceNode={master}{JlguofZdQvWhiaVGat8v6w}{U2Gh6B_ZRUWHIljz1TxnFw}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true}}
[2019-07-15T12:38:31,005][INFO ][o.e.x.s.a.TokenService   ] [data1] refresh keys
[2019-07-15T12:38:31,775][INFO ][o.e.x.s.a.TokenService   ] [data1] refreshed keys

data2

    [root@elastic3 elasticsearch]# tail -f /var/log/elasticsearch/nms.log
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
            at java.lang.Thread.run(Thread.java:835) ~[?:?]
    [2019-07-15T05:41:13,850][INFO ][o.e.n.Node               ] [data2] stopping ...
    [2019-07-15T05:41:13,877][INFO ][o.e.x.w.WatcherService   ] [data2] stopping watch service, reason [shutdown initiated]
    [2019-07-15T05:41:13,997][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [data2] [controller/7197] [Main.cc@150] Ml controller exiting
    [2019-07-15T05:41:14,001][INFO ][o.e.x.m.p.NativeController] [data2] Native controller process has stopped - no new native processes can be started
    [2019-07-15T05:41:14,029][INFO ][o.e.n.Node               ] [data2] stopped
    [2019-07-15T05:41:14,029][INFO ][o.e.n.Node               ] [data2] closing ...
    [2019-07-15T05:41:14,057][INFO ][o.e.n.Node               ] [data2] closed
    [2019-07-15T05:41:18,518][INFO ][o.e.e.NodeEnvironment    ] [data2] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [43.4gb], net total_space [49.9gb], types [rootfs]
    [2019-07-15T05:41:18,524][INFO ][o.e.e.NodeEnvironment    ] [data2] heap size [989.8mb], compressed ordinary object pointers [true]
    [2019-07-15T05:41:18,550][INFO ][o.e.n.Node               ] [data2] node name [data2], node ID [0L8h9xYNToexfdx8MLLMAQ], cluster name [nms]
    [2019-07-15T05:41:18,551][INFO ][o.e.n.Node               ] [data2] version[7.2.0], pid[562], build[default/rpm/508c38a/2019-06-20T15:54:18.811730Z], OS[Linux/3.10.0-957.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12.0.1/12.0.1+12]
    [2019-07-15T05:41:18,551][INFO ][o.e.n.Node               ] [data2] JVM home [/usr/share/elasticsearch/jdk]
    [2019-07-15T05:41:18,552][INFO ][o.e.n.Node               ] [data2] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-7619731361125843351, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/var/lib/elasticsearch, -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -Dio.netty.allocator.type=unpooled, -XX:MaxDirectMemorySize=536870912, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/etc/elasticsearch, -Des.distribution.flavor=default, -Des.distribution.type=rpm, -Des.bundled_jdk=true]
....
    [2019-07-15T05:41:20,947][INFO ][o.e.p.PluginsService     ] [data2] no plugins loaded
    [2019-07-15T05:41:27,232][INFO ][o.e.x.s.a.s.FileRolesStore] [data2] parsed [0] roles from file [/etc/elasticsearch/roles.yml]
    [2019-07-15T05:41:28,153][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [data2] [controller/764] [Main.cc@110] controller (64 bit): Version 7.2.0 (Build 65aefcbfce449b) Copyright (c) 2019 Elasticsearch BV
    [2019-07-15T05:41:28,802][DEBUG][o.e.a.ActionModule       ] [data2] Using REST wrapper from plugin org.elasticsearch.xpack.security.Security
    [2019-07-15T05:41:29,428][INFO ][o.e.d.DiscoveryModule    ] [data2] using discovery type [zen] and seed hosts providers [settings]
    [2019-07-15T05:41:30,628][INFO ][o.e.n.Node               ] [data2] initialized
    [2019-07-15T05:41:30,629][INFO ][o.e.n.Node               ] [data2] starting ...
    [2019-07-15T05:41:30,846][INFO ][o.e.t.TransportService   ] [data2] publish_address {172.27.52.57:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}, {172.27.52.57:9300}
    [2019-07-15T05:41:30,859][INFO ][o.e.b.BootstrapChecks    ] [data2] bound or publishing to a non-loopback address, enforcing bootstrap checks
    [2019-07-15T05:41:30,978][INFO ][o.e.c.c.Coordinator      ] [data2] cluster UUID [Tx77PhoRTv6KUsXkON5DGA]
    [2019-07-15T05:41:31,320][INFO ][o.e.c.c.JoinHelper       ] [data2] failed to join {master}{JlguofZdQvWhiaVGat8v6w}{U2Gh6B_ZRUWHIljz1TxnFw}{172.27.52.55}{172.27.52.55:9300}{ml.machine_memory=14542614528, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={data2}{0L8h9xYNToexfdx8MLLMAQ}{4C9SyITHTbWKJlMV2MNUgA}{172.27.52.57}{172.27.52.57:9300}{ml.machine_memory=16656531456, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional.empty}
    org.elasticsearch.transport.RemoteTransportException: [master][172.27.52.55:9300][internal:cluster/coordination/join]
    Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
            at org.elasticsearch.cluster.coordination.Coordinator$3.onFailure(Coordinator.java:500) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.cluster.coordination.JoinHelper$5.handleException(JoinHelper.java:359) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1111) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:246) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688) ~[elasticsearch-7.2.0.jar:7.2.0]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
            at java.lang.Thread.run(Thread.java:835) [?:?]
    Caused by: org.elasticsearch.transport.RemoteTransportException: [data2][172.27.52.57:9300][internal:cluster/coordination/join/validate]
    Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 7UiHGr-tRNG2PPoJ6Po5Nw than local cluster uuid Tx77PhoRTv6KUsXkON5DGA, rejecting
            at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:147) ~[elasticsearch-7.2.0.jar:7.2.0]
            at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:250) ~[?:?]

On the 3rd node (data2) the error I get is:

Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 7UiHGr-tRNG2PPoJ6Po5Nw than local cluster uuid Tx77PhoRTv6KUsXkON5DGA, rejecting

Thanks, and thanks for the better formatting too!

The log message shows that you have formed multiple clusters, and the link I shared above describes why this is a problem and what you should do to fix it:

If you intended to form a single cluster then you should start again:

  • Take a snapshot of each of the single-host clusters if you do not want to lose any data that they hold. Note that each cluster must use its own snapshot repository.
  • Shut down all the nodes.
  • Completely wipe each node by deleting the contents of their data folders.
  • Configure cluster.initial_master_nodes as described above.
  • Restart all the nodes and verify that they have formed a single cluster.
  • Restore any snapshots as required.
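For example, a minimal sketch of those steps on each node, assuming the RPM install and the paths shown in the configs above (the wipe is destructive, so only do it if you can live without the data):

    # stop the node (the RPM install runs as a systemd service)
    systemctl stop elasticsearch

    # wipe the node's local cluster state and data (destructive!)
    # path.data is /var/lib/elasticsearch in the configs above
    rm -rf /var/lib/elasticsearch/*

    # set cluster.initial_master_nodes in elasticsearch.yml as described,
    # then bring the node back up
    systemctl start elasticsearch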

In order to make it clear:

the setting -> cluster.initial_master_nodes: 172.27.52.55

do I have to put it on the data nodes as well, or only on the master node?

Thank you

Solved!

[root@elastic1 elasticsearch]# curl localhost:9200/_cat/nodes 
172.27.52.56 35 72  5 0.18 0.15 0.23 di - data1
172.27.52.57 34 53 10 0.24 0.14 0.23 di - data2
172.27.52.55 34 47  1 0.10 0.30 0.19 mi * master
[root@elastic1 elasticsearch]# curl localhost:9200/_cat/health 
1563214646 18:17:26 nms green 3 2 2 1 0 0 0 0 - 100.0%

Each node (master or data) must declare the master node in elasticsearch.yml.

(thank you DavidTurner)

Only on the master node(s). But all the nodes should have discovery.seed_hosts set.
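For illustration, a sketch of the relevant lines, with the names and IPs taken from this thread (everything else can stay at its defaults):

    # master node (172.27.52.55)
    cluster.name: nms
    node.name: master
    cluster.initial_master_nodes: ["172.27.52.55"]
    discovery.seed_hosts: ["172.27.52.55"]

    # data nodes (here data1; data2 is the same apart from node.name)
    cluster.name: nms
    node.name: data1
    node.master: false
    node.data: true
    discovery.seed_hosts: ["172.27.52.55"]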

No, this didn't work before.

I had to put cluster.initial_master_nodes: 172.27.52.55 on all nodes.

That tells us that something else was wrong. The cluster.initial_master_nodes setting has no effect on non-master nodes. It also has no effect on master nodes that have joined a cluster.
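Once the nodes are up, one way to verify that they have all joined the same cluster is to check that every node reports the same cluster_uuid, with something like:

    # all three nodes must report the same cluster_uuid once the cluster has formed
    for ip in 172.27.52.55 172.27.52.56 172.27.52.57; do
      curl -s "$ip:9200/" | grep cluster_uuid
    done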

I will test it in a lab and I will let you know.


Hi, one more question.

I made node2 (data1) also a master node.

[root@elastic1 elasticsearch]# curl localhost:9200/_cat/nodes?v
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.27.52.55           11          52   3    0.64    0.19     0.10 mi        *      master
172.27.52.56           21          30   0    0.00    0.01     0.05 mdi       -      data1
172.27.52.57           31          55   7    0.86    0.66     0.44 di        -      data2

But when I stopped Elasticsearch on node1 (master, 172.27.52.55) I received the error below:

[root@elastic2 elasticsearch]# curl localhost:9200/_cat/nodes?v
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}[root@elastic2 elasticsearch]# curl localhost:9200/_cat/nodes?v

Why? I declared on all nodes that I have two masters:

cluster.initial_master_nodes: 172.27.52.56, 172.27.52.57

    [root@elastic2 elasticsearch]# cat elasticsearch.yml            
    # ======================== Elasticsearch Configuration =========================
    #
    # NOTE: Elasticsearch comes with reasonable defaults for most settings.
    #       Before you set out to tweak and tune the configuration, make sure you
    #       understand what are you trying to accomplish and the consequences.
    #
    # The primary way of configuring a node is via this file. This template lists
    # the most important settings you may want to configure for a production cluster.
    #
    # Please consult the documentation for further information on configuration options:
    # https://www.elastic.co/guide/en/elasticsearch/reference/index.html
    #
    # ---------------------------------- Cluster -----------------------------------
    #
    # Use a descriptive name for your cluster:
    #
    cluster.name: nms
    #
    # ------------------------------------ Node ------------------------------------
    #
    # Use a descriptive name for the node:
    #
    node.name: data1
    #
    # Add custom attributes to the node:
    #
    #node.attr.rack: r1
    cluster.initial_master_nodes: 172.27.52.56, 172.27.52.57
    node.master: true
    node.data: true
    #

Thank you once more.

If you want a fault-tolerant cluster you need at least three master-eligible nodes.
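For example, with the three nodes in this thread that would mean making all of them master-eligible, so that any two still form a majority when the third is down. A sketch of the changes, based on the configs above:

    # on all three nodes (172.27.52.55 / .56 / .57)
    node.master: true
    discovery.seed_hosts: ["172.27.52.55", "172.27.52.56", "172.27.52.57"]
    # only consulted on the very first start of a brand-new cluster:
    cluster.initial_master_nodes: ["172.27.52.55", "172.27.52.56", "172.27.52.57"]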