Unable to Add Data Node to Elasticsearch Cluster

Hi, I'm having trouble adding another data node to my Elasticsearch cluster, which currently has a single master node. Running 7.16.1.

Here's the elasticsearch.yml from the master node:

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: elk-qa
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: master-data-1
#
# Add custom attributes to the node:
#
node.roles: [master, data, ingest]
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
network.host: [_local_, _site_, _ens4_ ]
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
http.port: 9200
#
transport.tcp.port: 9300
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["10.182.0.5", "10.182.0.6"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["10.182.0.5"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
discovery.zen.hosts_provider: file
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
#discovery.type: single-node
#
# Enable security
xpack.security.enabled: true
# Enable auditing if you want, uncomment
# xpack.security.audit.enabled: true
# SSL HTTP Settings
#xpack.security.http.ssl.enabled: true
#xpack.security.http.ssl.keystore.path: http.p12
# SSL Transport Settings
xpack.security.transport.ssl.enabled: true
#xpack.security.transport.ssl.verification_mode: certificate
#xpack.security.transport.ssl.client_authentication: required
#xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
#xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

And here is the elasticsearch.yml for data-node-1:

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: elk-qa
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: data-node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
node.roles: [ data, ingest ]
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /var/lib/elasticsearch
#
# Path to log files:
#
path.logs: /var/log/elasticsearch
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
network.host: [_local_, _site_, _ens4_ ]
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
http.port: 9200
#
transport.tcp.port: 9300
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["10.182.0.5", "10.182.0.6"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["master-data-1"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
#discovery.type: single-node

The two machines can see each other on port 9200 when I use curl, but not on port 9300.

When I run curl -XGET 'http://localhost:9200/_cluster/state?pretty' I get the following output:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503

Ports 9200 and 9300 are open on both machines.
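A plain TCP probe along these lines is what I mean by the ports being open (a sketch, assuming netcat is installed on both hosts; the IPs are the ones from the configs above):

# Check that the transport port accepts TCP connections from the other machine
nc -zv 10.182.0.5 9300
nc -zv 10.182.0.6 9300

Since 9300 is the transport port rather than an HTTP endpoint, curl on its own isn't a reliable test for it.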

Welcome to our community! :smiley:

Please provide the entire Elasticsearch log from startup.

Hi Warkolm, thanks for your reply. These are the logs I see on the data node:

[2022-01-03T23:06:00,557][WARN ][o.e.c.c.ClusterFormationFailureHelper] [data-node-1] master not discovered yet: have discovered [{data-node-1}{Ia-oXn2zTqquycYzZX934A}{_PqAuBFqRNWOb55fe5vBSg}{10.182.0.6}{10.182.0.6:9300}{di}]; discovery will continue using [10.182.0.5:9300] from hosts providers and [] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-01-03T23:06:01,358][WARN ][o.e.d.PeerFinder         ] [data-node-1] address [10.182.0.5:9300], node [null], requesting [false] connection failed: [][10.182.0.5:9300] general node connection failure: handshake failed because connection reset
[2022-01-03T23:06:02,359][WARN ][o.e.d.PeerFinder         ] [data-node-1] address [10.182.0.5:9300], node [null], requesting [false] connection failed: [][10.182.0.5:9300] general node connection failure: handshake failed because connection reset
[2022-01-03T23:06:03,359][WARN ][o.e.d.PeerFinder         ] [data-node-1] address [10.182.0.5:9300], node [null], requesting [false] connection failed: [][10.182.0.5:9300] general node connection failure: handshake failed because connection reset
[2022-01-03T23:06:04,360][WARN ][o.e.d.PeerFinder         ] [data-node-1] address [10.182.0.5:9300], node [null], requesting [false] connection failed: [][10.182.0.5:9300] general node connection failure: handshake failed because connection reset
[2022-01-03T23:06:05,360][WARN ][o.e.d.PeerFinder         ] [data-node-1] address [10.182.0.5:9300], node [null], requesting [false] connection failed: [][10.182.0.5:9300] general node connection failure: handshake failed because connection reset
[2022-01-03T23:06:06,365][WARN ][o.e.d.PeerFinder         ] [data-node-1] address [10.182.0.5:9300], node [null], requesting [false] connection failed: [][10.182.0.5:9300] general node connection failure: handshake failed because connection reset

And here are the logs from the master node:

[2022-01-03T23:09:37,637][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [master-data-1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.182.0.5:9300, remoteAddress=/10.182.0.6:56632, profile=default}
[2022-01-03T23:09:38,638][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [master-data-1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.182.0.5:9300, remoteAddress=/10.182.0.6:56634, profile=default}
[2022-01-03T23:09:39,638][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [master-data-1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.182.0.5:9300, remoteAddress=/10.182.0.6:56636, profile=default}
[2022-01-03T23:09:40,639][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [master-data-1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.182.0.5:9300, remoteAddress=/10.182.0.6:56638, profile=default}
[2022-01-03T23:09:41,643][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [master-data-1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.182.0.5:9300, remoteAddress=/10.182.0.6:56640, profile=default}
[2022-01-03T23:09:42,644][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [master-data-1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.182.0.5:9300, remoteAddress=/10.182.0.6:56642, profile=default}

So I think I know my issue: I haven't set up X-Pack security on the data node yet. My question is, should I create the same passwords for the users as I have on the master node?

Thanks,

Jeff

Your data node is trying to communicate using plaintext, while the master expects the connection to be encrypted.

[2022-01-03T23:09:42,644][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [master-data-1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.182.0.5:9300, remoteAddress=/10.182.0.6:56642, profile=default}

Looking at the configuration files that you shared, your master node has the following options enabled:

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true

The second one makes your nodes talk to each other only over an encrypted connection, so you need to put those same settings in your data node's configuration as well.
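In practice that means the data node's elasticsearch.yml needs at least something like the following (a sketch; the keystore and truststore paths mirror the commented-out lines in your master's config and assume the same elastic-certificates.p12 file is copied into the data node's config directory):

xpack.security.enabled: true
# Require TLS on the transport layer, matching the master node
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
# Paths are relative to the config directory; assumes the same
# elastic-certificates.p12 is present on the data node
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

If the PKCS#12 file is password-protected, that password also has to be added to the Elasticsearch keystore on the node (bin/elasticsearch-keystore add xpack.security.transport.ssl.keystore.secure_password).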

Do I need to set up the elastic user passwords on the data node as well, and should they match the passwords that are already created on the master node?

No, you only need to set up the elastic user passwords once; since you already did that, there is no need to run the commands again.
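For reference, the password setup in question is the one-time elasticsearch-setup-passwords run; a sketch, assuming a deb/rpm install with the default paths:

# Prompts for passwords for the built-in users (elastic, kibana_system, etc.)
# and stores them in the cluster's security index, which is why it only
# needs to be run once per cluster, not once per node.
/usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive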

It's still not connecting. Could it be a certificate issue?

        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:583) [netty-transport-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.66.Final.jar:4.1.66.Final]
        at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: javax.net.ssl.SSLHandshakeException: No available authentication scheme
        at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
        at sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?]
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:357) ~[?:?]
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:313) ~[?:?]
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:304) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateProducer.onProduceCertificate(CertificateMessage.java:972) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateProducer.produce(CertificateMessage.java:961) ~[?:?]
        at sun.security.ssl.SSLHandshake.produce(SSLHandshake.java:440) ~[?:?]
        at sun.security.ssl.ClientHello$T13ClientHelloConsumer.goServerHello(ClientHello.java:1246) ~[?:?]
        at sun.security.ssl.ClientHello$T13ClientHelloConsumer.consume(ClientHello.java:1182) ~[?:?]
        at sun.security.ssl.ClientHello$ClientHelloConsumer.onClientHello(ClientHello.java:840) ~[?:?]
        at sun.security.ssl.ClientHello$ClientHelloConsumer.consume(ClientHello.java:801) ~[?:?]
        at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396) ~[?:?]
        at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264) ~[?:?]
        at java.security.AccessController.doPrivileged(AccessController.java:712) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209) ~[?:?]
        at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1550) ~[netty-handler-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1396) ~[netty-handler-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1237) ~[netty-handler-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1286) ~[netty-handler-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507) ~[netty-codec-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446) ~[netty-codec-4.1.66.Final.jar:4.1.66.Final]
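For what it's worth, a "No available authentication scheme" handshake failure is raised by the TLS layer when the node acting as the TLS server cannot offer a usable certificate and key on the connection, and the keystore/truststore lines are still commented out in the master config shown earlier. A sketch of what generating and wiring in the transport certificates could look like, assuming a deb/rpm install and reusing the elastic-certificates.p12 name from those commented-out settings:

# 1. Create a certificate authority (writes elastic-stack-ca.p12 by default)
/usr/share/elasticsearch/bin/elasticsearch-certutil ca

# 2. Create a node certificate signed by that CA (writes elastic-certificates.p12)
/usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12

# 3. Copy the resulting file into every node's config directory
cp elastic-certificates.p12 /etc/elasticsearch/

Then each node's elasticsearch.yml would uncomment or add:

xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12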

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.