Elasticsearch 7.x - Can't add new node to cluster

humartinez · September 3, 2019, 8:19pm

Hi there,
I'm having an issue trying to add a new server to an existing cluster that is using encryption, and I'm getting the following error.

[2019-09-03T20:11:10,344][WARN ][o.e.c.c.ClusterFormationFailureHelper] [elastic-kibana] 
master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, 
and [cluster.initial_master_nodes] is empty on this node: have discovered [{elastic01} 
{Q77grcs6Q2uIbcrGllKcyQ}{YXvwWhRjT6OXOq1zhe8cBQ}{10.2.208.26}{10.2.208.26:9300} 
{ml.machine_memory=33742123008, ml.max_open_jobs=20, xpack.installed=true}, {elastic02} 
{FT0SSbtQQkOvIoh7qwzvYg}{ooFyk1dqQ36hNK7CojUABw}{10.2.208.27}{10.2.208.27:9300} 
{ml.machine_memory=33742123008, ml.max_open_jobs=20, xpack.installed=true}, {elastic03} 
{8vIBMx9ZTRq2USZ4n295ag}{sWsPggPHQcO_dE3P6zGXUA}{10.2.208.28}{10.2.208.28:9300} 
{ml.machine_memory=33742123008, ml.max_open_jobs=20, xpack.installed=true}]; discovery 
will continue using [10.2.208.26:9300, 10.2.208.27:9300, 10.2.208.28:9300] from hosts providers 
and [{elastic-kibana}{sWb4AwzaRPGWmxJYCo3Sgw}{svrZlwvJSeC2J2P6RYmRwQ}{10.4.28.35} 
{10.4.28.35:9300}{ml.machine_memory=8375558144, xpack.installed=true, 
ml.max_open_jobs=20}] from last-known cluster state; node term 18, last-accepted version 0 in 
term 0
[2019-09-03T20:11:14,786][INFO ][o.e.c.c.JoinHelper       ] [elastic-kibana] failed to join 
{elastic02}{FT0SSbtQQkOvIoh7qwzvYg}{ooFyk1dqQ36hNK7CojUABw}{10.2.208.27} 
{10.2.208.27:9300}{ml.machine_memory=33742123008, ml.max_open_jobs=20, 
xpack.installed=true} with JoinRequest{sourceNode={elastic-kibana} 
{sWb4AwzaRPGWmxJYCo3Sgw}{svrZlwvJSeC2J2P6RYmRwQ}{10.4.28.35}{10.4.28.35:9300} 
{ml.machine_memory=8375558144, xpack.installed=true, ml.max_open_jobs=20}, 
optionalJoin=Optional.empty}
org.elasticsearch.transport.RemoteTransportException: [elastic02][10.2.208.27:9300] 
[internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.ConnectTransportException: [elastic-kibana] 
[10.4.28.35:9300] connect_timeout[30s]
    at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1306) ~[elasticsearch-7.1.1.jar:7.1.1]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.1.1.jar:7.1.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
[2019-09-03T20:11:14,787][INFO ][o.e.c.c.JoinHelper       ] [elastic-kibana] failed to join 
{elastic02}{FT0SSbtQQkOvIoh7qwzvYg}{ooFyk1dqQ36hNK7CojUABw}{10.2.208.27} 
{10.2.208.27:9300}{ml.machine_memory=33742123008, ml.max_open_jobs=20, 
xpack.installed=true} with JoinRequest{sourceNode={elastic-kibana} 
{sWb4AwzaRPGWmxJYCo3Sgw}{svrZlwvJSeC2J2P6RYmRwQ}{10.4.28.35}{10.4.28.35:9300} 
{ml.machine_memory=8375558144, xpack.installed=true, ml.max_open_jobs=20}, 
optionalJoin=Optional.empty}
org.elasticsearch.transport.RemoteTransportException: [elastic02][10.2.208.27:9300] 
[internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.ConnectTransportException: [elastic-kibana][10.4.28.35:9300] connect_timeout[30s]
    at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1306) ~[elasticsearch-7.1.1.jar:7.1.1]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.1.1.jar:7.1.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

Thanks in advance

humartinez · September 4, 2019, 1:22pm

Can somebody please help?

humartinez · September 4, 2019, 3:48pm

I also found this error on the master:

{"type": "server", "timestamp": "2019-09-04T15:21:48,425+0000", "level": "WARN", "component": 
"o.e.t.OutboundHandler", "cluster.name": "IPTV-Cluster", "node.name": "elastic02", 
"cluster.uuid": "VkWvT-1jSsaC6aXv_3GcJg", "node.id": "FT0SSbtQQkOvIoh7qwzvYg",  "message": 
"send message failed [channel: Netty4TcpChannel{localAddress=/10.2.208.27:9300, 
remoteAddress=/10.4.28.35:48390}]" ,
"stacktrace": ["java.nio.channels.ClosedChannelException: null",
"at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]"] }

Any help would be appreciated.

Is there a way to increase the log level for this?

elasticforme · September 4, 2019, 4:26pm

Try adding the new node with
#cluster.initial_master_nodes:

But then again, it might be because, as you said, you are using encryption.

humartinez · September 4, 2019, 4:41pm

Hi
I added this line

  cluster.initial_master_nodes: ["elastic01", "elastic02","elastic03"]

with the names of the other nodes in the cluster, but I still get the same error.

I configured the key using this as a reference: Encrypting communications in Elasticsearch | Elasticsearch Guide [7.3] | Elastic

First I created the CA cert like this:

bin/elasticsearch-certutil ca

and configured the first 3 nodes. Now, a few weeks later, I'm trying to add a new node (as a coordinating node, to install Kibana on it) and I'm unable to make it join the cluster.

./bin/elasticsearch-certutil cert --ca /tmp/elastic-stack-ca.p12 --dns elastic-kibana --ip 1x.x.x.x --out /etc/elasticsearch/certs/elastic-kibana.p12

I also generated the cert for this host without the --dns and --ip options, but with the same results.

One more thing: when I'm creating the cert, I get these warnings:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.bouncycastle.jcajce.provider.drbg.DRBG 
(file:/usr/share/elasticsearch/lib/tools/security-cli/bcprov-jdk15on-1.61.jar) to constructor 
sun.security.provider.Sun()
WARNING: Please consider reporting this to the maintainers of 
org.bouncycastle.jcajce.provider.drbg.DRBG
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access 
operations
WARNING: All illegal access operations will be denied in a future release
This tool assists you in the generation of X.509 certificates and certificate
signing requests for use with SSL/TLS in the Elastic stack.

pjanzen · September 4, 2019, 4:48pm

I would start with a basic telnet from the node you want to join to the nodes that are already there, on ports 9200 and 9300. If you get a connection, then start looking at your config, but from what I can tell from the logs, the node that wants to join cannot connect to the existing nodes.
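If telnet is not installed on the nodes, the same reachability check can be sketched in Python. This is only a minimal sketch: the helper names `can_connect` and `check_ports` are made up for illustration, and the 3-second timeout is an arbitrary assumption. Note that the `connect_timeout` in the first post was reported by elastic02 while dialing back to 10.4.28.35:9300, so it may be worth running the check in both directions (from the new node to the existing ones, and from an existing node to the new one).

```python
import socket
from typing import Dict, Iterable, Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False


def check_ports(hosts: Iterable[str], ports: Iterable[int] = (9200, 9300),
                timeout: float = 3.0) -> Dict[Tuple[str, int], bool]:
    """Probe every host/port pair and report which ones accepted a connection."""
    ports = tuple(ports)
    return {(h, p): can_connect(h, p, timeout) for h in hosts for p in ports}


# Example, run on elastic-kibana with the node addresses from the logs above:
# for (host, port), ok in check_ports(["10.2.208.26", "10.2.208.27", "10.2.208.28"]).items():
#     print(f"{host}:{port}", "open" if ok else "unreachable")
```

An open port only shows that the TCP path is clear (no firewall or security group in the way). It says nothing about whether the TLS handshake on 9300 will succeed.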

elasticforme · September 4, 2019, 5:05pm

Did you make a new key?
I thought you were supposed to copy the key from an existing master to the new node.

humartinez · September 4, 2019, 5:15pm

When I issued this command, I used the --ca file that I copied from one of the master nodes:

./bin/elasticsearch-certutil cert --ca /tmp/elastic-stack-ca.p12 --dns elastic-kibana --ip 1x.x.x.x --out /etc/elasticsearch/certs/elastic-kibana.p12

elasticforme · September 4, 2019, 5:35pm

No, this is not what it says in the document, and that is not what I have done.

I created the .p12 file on the master server and copied that file to the other master/data nodes with the same permissions and ownership.

humartinez · September 4, 2019, 5:40pm

Sorry, I'm not following. Could you make it a little bit clearer?

elasticforme · September 4, 2019, 5:46pm

https://www.elastic.co/blog/getting-started-with-elasticsearch-security

The document above says to create a .p12 key and copy it to all the servers.
See Step #3.

humartinez · September 4, 2019, 6:54pm

I've rebuilt the certs. The cluster re-formed, but the new host still does not join.

[2019-09-04T18:52:09,115][WARN ][o.e.t.OutboundHandler    ] [elastic-kibana] send message 
failed [channel: Netty4TcpChannel{localAddress=/10.4.28.35:37204, 
remoteAddress=elastic01/10.2.208.26:9300}]
javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2019-09-04T18:52:09,122][WARN ][o.e.t.OutboundHandler    ] [elastic-kibana] send message 
failed [channel: Netty4TcpChannel{localAddress=/10.4.28.35:50948, 
remoteAddress=elastic02/10.2.208.27:9300}]
javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2019-09-04T18:52:09,122][WARN ][o.e.t.OutboundHandler    ] [elastic-kibana] send message 
failed [channel: Netty4TcpChannel{localAddress=/10.4.28.35:37780, 
remoteAddress=elastic03/10.2.208.28:9300}]
javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2019-09-04T18:52:09,127][WARN ][o.e.t.TcpTransport       ] [elastic-kibana] exception caught on 
transport layer [Netty4TcpChannel{localAddress=/10.4.28.35:37780, 
remoteAddress=elastic03/10.2.208.28:9300}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

elasticforme · September 4, 2019, 7:02pm

I just added a node to my 3-node cluster.

I made sure I have the same uid/gid for elasticsearch/logstash/kibana
installed all the RPMs
copied all config files from the master
changed the name/IP in each config file
copied the .p12 key from the master
and started the new node's elasticsearch.service, and it joined the cluster.

humartinez · September 4, 2019, 7:08pm

Sorry, I may not be doing the same as you, but I'm sticking to the guide that I posted in a previous message:

./bin/elasticsearch-certutil ca
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic01 --ip 10.2.208.26 --out /tmp/certs/elastic01.p12
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic02 --ip 10.2.208.27 --out /tmp/certs/elastic02.p12
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic03 --ip 10.2.208.28 --out /tmp/certs/elastic03.p12
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic-kibana --ip 10.4.28.35 --out /tmp/certs/elastic-kibana.p12

root@elastic01:/tmp/certs# ls -ltr
total 16
-rw------- 1 root root 3475 Sep  4 18:45 elastic01.p12
-rw------- 1 root root 3475 Sep  4 18:45 elastic02.p12
-rw------- 1 root root 3475 Sep  4 18:46 elastic03.p12
-rw------- 1 root root 3483 Sep  4 18:46 elastic-kibana.p12

Then I copied the files to /etc/elasticsearch/certs with elasticsearch:elasticsearch as owner.

And as I told you, the cluster has formed with the first 3 nodes.

On the master node I see this:

[2019-09-04T18:52:18,831][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [elastic02] client did 
not trust this server's certificate, closing connection 
Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/10.4.28.35:51000

elasticforme · September 5, 2019, 1:17pm

OK, so if you have three .p12 files, one for each node,
then where is the fourth file for the fourth node?

humartinez · September 5, 2019, 6:23pm

It's the last one, elastic-kibana.p12.

Finally I'm seeing some light at the end of the tunnel: I was having an issue with the security group. But I'm still facing a problem.

On the current master I see:

[2019-09-05T18:03:04,292][WARN ][o.e.t.TcpTransport       ] [elastic01] exception caught on transport layer [Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:58554, remoteAddress=10.4.28.35/10.4.28.35:9300}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
....
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
....
Caused by: java.security.cert.CertificateException: No subject alternative names present

At the "4th" node I see:

[2019-09-05T18:14:21,484][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [elastic-kibana] client did not trust this server's certificate, closing connection Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/10.2.208.26:60140}

In addition, I don't understand why the first 3 nodes have joined and this one does not. I created the certs in the same way for all 4 nodes.
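The errors posted in this thread point at different layers (network first, then certificates), so it can help to sort them before changing anything. Below is a rough triage sketch in Python. The substrings are copied from the log excerpts above, but the function name `triage` and the hint wording are assumptions made for illustration, not something Elasticsearch provides.

```python
from typing import List, Tuple

# (substring seen in the logs above, layer, hedged hint). The hints are
# assumptions about the most plausible cause, not a definitive diagnosis.
HINTS: List[Tuple[str, str, str]] = [
    ("connect_timeout", "network",
     "TCP connection to the peer's transport port timed out; "
     "check firewalls / security groups in that direction"),
    ("No subject alternative names present", "certificate",
     "the certificate presented by the peer has no SAN entries, "
     "so matching it against the peer's hostname or IP fails"),
    ("client did not trust this server's certificate", "certificate",
     "the remote side rejected the certificate this node presented"),
    ("SSLEngine closed already", "tls",
     "the TLS session was already torn down; look for an earlier "
     "handshake error on either side"),
]


def triage(log_text: str) -> List[Tuple[str, str]]:
    """Return (layer, hint) for every known substring found in log_text."""
    return [(layer, hint) for needle, layer, hint in HINTS if needle in log_text]


# Example: paste a log excerpt and print the matching hints.
# for layer, hint in triage(open("elasticsearch.log").read()):
#     print(f"[{layer}] {hint}")
```

Applied to the last two excerpts, this would flag a certificate problem on both sides. That suggests looking at which .p12 file the 4th node is actually loading, and whether that file contains the --dns/--ip entries.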

system · October 3, 2019, 6:23pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
