Elasticsearch 7.x - Can't add new node to cluster

Hi there,
I'm having an issue trying to add a new server to an existing cluster that is using encryption, and I'm getting the following error:

[2019-09-03T20:11:10,344][WARN ][o.e.c.c.ClusterFormationFailureHelper] [elastic-kibana] 
master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, 
and [cluster.initial_master_nodes] is empty on this node: have discovered [{elastic01} 
{Q77grcs6Q2uIbcrGllKcyQ}{YXvwWhRjT6OXOq1zhe8cBQ}{10.2.208.26}{10.2.208.26:9300} 
{ml.machine_memory=33742123008, ml.max_open_jobs=20, xpack.installed=true}, {elastic02} 
{FT0SSbtQQkOvIoh7qwzvYg}{ooFyk1dqQ36hNK7CojUABw}{10.2.208.27}{10.2.208.27:9300} 
{ml.machine_memory=33742123008, ml.max_open_jobs=20, xpack.installed=true}, {elastic03} 
{8vIBMx9ZTRq2USZ4n295ag}{sWsPggPHQcO_dE3P6zGXUA}{10.2.208.28}{10.2.208.28:9300} 
{ml.machine_memory=33742123008, ml.max_open_jobs=20, xpack.installed=true}]; discovery 
will continue using [10.2.208.26:9300, 10.2.208.27:9300, 10.2.208.28:9300] from hosts providers 
and [{elastic-kibana}{sWb4AwzaRPGWmxJYCo3Sgw}{svrZlwvJSeC2J2P6RYmRwQ}{10.4.28.35} 
{10.4.28.35:9300}{ml.machine_memory=8375558144, xpack.installed=true, 
ml.max_open_jobs=20}] from last-known cluster state; node term 18, last-accepted version 0 in 
term 0
[2019-09-03T20:11:14,786][INFO ][o.e.c.c.JoinHelper       ] [elastic-kibana] failed to join 
{elastic02}{FT0SSbtQQkOvIoh7qwzvYg}{ooFyk1dqQ36hNK7CojUABw}{10.2.208.27} 
{10.2.208.27:9300}{ml.machine_memory=33742123008, ml.max_open_jobs=20, 
xpack.installed=true} with JoinRequest{sourceNode={elastic-kibana} 
{sWb4AwzaRPGWmxJYCo3Sgw}{svrZlwvJSeC2J2P6RYmRwQ}{10.4.28.35}{10.4.28.35:9300} 
{ml.machine_memory=8375558144, xpack.installed=true, ml.max_open_jobs=20}, 
optionalJoin=Optional.empty}
org.elasticsearch.transport.RemoteTransportException: [elastic02][10.2.208.27:9300] 
[internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.ConnectTransportException: [elastic-kibana] 
[10.4.28.35:9300] connect_timeout[30s]
    at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1306) ~[elasticsearch-7.1.1.jar:7.1.1]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.1.1.jar:7.1.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
[2019-09-03T20:11:14,787][INFO ][o.e.c.c.JoinHelper       ] [elastic-kibana] failed to join 
{elastic02}{FT0SSbtQQkOvIoh7qwzvYg}{ooFyk1dqQ36hNK7CojUABw}{10.2.208.27} 
{10.2.208.27:9300}{ml.machine_memory=33742123008, ml.max_open_jobs=20, 
xpack.installed=true} with JoinRequest{sourceNode={elastic-kibana} 
{sWb4AwzaRPGWmxJYCo3Sgw}{svrZlwvJSeC2J2P6RYmRwQ}{10.4.28.35}{10.4.28.35:9300} 
{ml.machine_memory=8375558144, xpack.installed=true, ml.max_open_jobs=20}, 
optionalJoin=Optional.empty}
org.elasticsearch.transport.RemoteTransportException: [elastic02][10.2.208.27:9300] 
[internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.ConnectTransportException: [elastic-kibana][10.4.28.35:9300] connect_timeout[30s]
    at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1306) ~[elasticsearch-7.1.1.jar:7.1.1]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) ~[elasticsearch-7.1.1.jar:7.1.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

Thanks in advance

Can somebody please help?

I also found this error on the master:

{"type": "server", "timestamp": "2019-09-04T15:21:48,425+0000", "level": "WARN", "component": 
"o.e.t.OutboundHandler", "cluster.name": "IPTV-Cluster", "node.name": "elastic02", 
"cluster.uuid": "VkWvT-1jSsaC6aXv_3GcJg", "node.id": "FT0SSbtQQkOvIoh7qwzvYg",  "message": 
"send message failed [channel: Netty4TcpChannel{localAddress=/10.2.208.27:9300, 
remoteAddress=/10.4.28.35:48390}]" ,
"stacktrace": ["java.nio.channels.ClosedChannelException: null",
"at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[?:?]"] }

Please, any help would be appreciated.

Is there a way to increase the log level for this?

Try adding the new node with
#cluster.initial_master_nodes:

But then again, it might be related to the encryption you said you are using.

Hi,
I added this line:

  cluster.initial_master_nodes: ["elastic01", "elastic02","elastic03"]

with the names of the other nodes in the cluster, but I still get the same error.

I configured the keys using this as a reference: https://www.elastic.co/guide/en/elasticsearch/reference/7.3/configuring-tls.html

First I created the CA cert like this:

bin/elasticsearch-certutil ca

and configured the first 3 nodes. Now, a few weeks later, I'm trying to add a new node (as a coordinating node, to install Kibana on it) and I'm unable to make it join the cluster:

./bin/elasticsearch-certutil cert --ca /tmp/elastic-stack-ca.p12 --dns elastic-kibana --ip 1x.x.x.x --out /etc/elasticsearch/certs/elastic-kibana.p12

I also made the cert for this host without the --dns and --ip options, but with the same results.

One more thing: when I'm creating the cert, I get these warnings:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.bouncycastle.jcajce.provider.drbg.DRBG 
(file:/usr/share/elasticsearch/lib/tools/security-cli/bcprov-jdk15on-1.61.jar) to constructor 
sun.security.provider.Sun()
WARNING: Please consider reporting this to the maintainers of 
org.bouncycastle.jcajce.provider.drbg.DRBG
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access 
operations
WARNING: All illegal access operations will be denied in a future release
This tool assists you in the generation of X.509 certificates and certificate
signing requests for use with SSL/TLS in the Elastic stack.

I would start with a basic telnet from the node you want to join to the nodes that are already there, on ports 9200 and 9300. If you get a connection, then start looking at your config. From what I can tell from the logs, the node that wants to join cannot connect to the existing nodes.
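
The connectivity test suggested above can also be scripted. This is a minimal sketch in Python instead of telnet; `port_is_open` is a hypothetical helper name, not part of any Elasticsearch tooling. The join error earlier in the thread reports elastic02 timing out while connecting to 10.4.28.35:9300, so it is probably worth checking both directions: from the new node to the existing nodes, and from an existing node back to the new one.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def report(hosts, ports=(9200, 9300)):
    """Print one line per host/port pair with the result of the check."""
    for host in hosts:
        for port in ports:
            state = "open" if port_is_open(host, port) else "unreachable"
            print(f"{host}:{port} {state}")
```

For example, on the new node one could call report(["10.2.208.26", "10.2.208.27", "10.2.208.28"]), and on an existing node report(["10.4.28.35"]). This only shows that the TCP port is reachable; it says nothing about whether the TLS handshake will succeed.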

Did you make a new key?
I thought you were supposed to copy the key from an existing master to the new node.

When I issued this command, I used the --ca file that I copied from one of the master nodes:

./bin/elasticsearch-certutil cert --ca /tmp/elastic-stack-ca.p12 --dns elastic-kibana --ip 1x.x.x.x --out /etc/elasticsearch/certs/elastic-kibana.p12

No, this is not what the document says, and it is not what I did.

I created a .p12 file on the master server and copied that file to the other master/data nodes with the same permissions and ownership.

Sorry, I'm not following. Could you make it a little clearer?

https://www.elastic.co/blog/getting-started-with-elasticsearch-security

The document above says to create a .p12 key and copy it to all the servers (step #3).

I've rebuilt the certs. The cluster re-formed, but the new host still does not join:

[2019-09-04T18:52:09,115][WARN ][o.e.t.OutboundHandler    ] [elastic-kibana] send message 
failed [channel: Netty4TcpChannel{localAddress=/10.4.28.35:37204, 
remoteAddress=elastic01/10.2.208.26:9300}]
javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2019-09-04T18:52:09,122][WARN ][o.e.t.OutboundHandler    ] [elastic-kibana] send message 
failed [channel: Netty4TcpChannel{localAddress=/10.4.28.35:50948, 
remoteAddress=elastic02/10.2.208.27:9300}]
javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2019-09-04T18:52:09,122][WARN ][o.e.t.OutboundHandler    ] [elastic-kibana] send message 
failed [channel: Netty4TcpChannel{localAddress=/10.4.28.35:37780, 
remoteAddress=elastic03/10.2.208.28:9300}]
javax.net.ssl.SSLException: SSLEngine closed already
    at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[?:?]
[2019-09-04T18:52:09,127][WARN ][o.e.t.TcpTransport       ] [elastic-kibana] exception caught on 
transport layer [Netty4TcpChannel{localAddress=/10.4.28.35:37780, 
remoteAddress=elastic03/10.2.208.28:9300}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General 
SSLEngine problem
    at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~ 
[netty-codec-4.1.32.Final.jar:4.1.32.Final]
    at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:656) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:556) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:510) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:470) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [netty-common-4.1.32.Final.jar:4.1.32.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

I just added a node to my 3-node cluster.

I made sure I have the same uid/gid for elasticsearch/logstash/kibana
installed all the RPMs
copied all the config files from the master
changed the name/IP in each config file
copied the .p12 key from the master
and started the new node's elasticsearch.service, and it joined the cluster.

Sorry, I may not be doing the same as you, but I'm sticking to the guide that I posted in a previous message:

./bin/elasticsearch-certutil ca
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic01 --ip 10.2.208.26 --out /tmp/certs/elastic01.p12
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic02 --ip 10.2.208.27 --out /tmp/certs/elastic02.p12
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic03 --ip 10.2.208.28 --out /tmp/certs/elastic03.p12
./bin/elasticsearch-certutil cert --ca /root/elk-stack-ca.p12 --dns elastic-kibana --ip 10.4.28.35 --out /tmp/certs/elastic-kibana.p12

root@elastic01:/tmp/certs# ls -ltr
total 16
-rw------- 1 root root 3475 Sep  4 18:45 elastic01.p12
-rw------- 1 root root 3475 Sep  4 18:45 elastic02.p12
-rw------- 1 root root 3475 Sep  4 18:46 elastic03.p12
-rw------- 1 root root 3483 Sep  4 18:46 elastic-kibana.p12

Then I copied the files to /etc/elasticsearch/certs with elasticsearch:elasticsearch as owner.

As I told you, the cluster has formed with the first 3 nodes.

On the master node I see this:

[2019-09-04T18:52:18,831][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [elastic02] client did 
not trust this server's certificate, closing connection 
Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/10.4.28.35:51000

OK, so you have three .p12 files, one for each node.
Then where is the fourth file for the fourth node?

It's the last one, elastic-kibana.p12.

Finally I'm seeing some light at the end of the tunnel: I had an issue with the security group. But I'm still facing a problem.

At the current master I see:

[2019-09-05T18:03:04,292][WARN ][o.e.t.TcpTransport       ] [elastic01] exception caught on transport layer [Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:58554, remoteAddress=10.4.28.35/10.4.28.35:9300}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
....
Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
....
Caused by: java.security.cert.CertificateException: No subject alternative names present

At the "4th" node I see:

[2019-09-05T18:14:21,484][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [elastic-kibana] client did not trust this server's certificate, closing connection Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:9300, remoteAddress=/10.2.208.26:60140}

In addition, I don't understand why the first 3 nodes have joined and this one does not. I created the certs in the same way for all 4 nodes.
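
A note on the `No subject alternative names present` message. In Java's hostname verification, this is the error raised when a peer is contacted by IP address and the certificate it presents carries no subject alternative name (SAN) entries at all. A certificate that has SANs but none for that IP produces a differently worded message. The sketch below models that check in Python. It is a simplified illustration, not the JDK's or Elasticsearch's code, and `check_ip_san` is a hypothetical name. If this reading applies here, the certificate actually loaded by the node at 10.4.28.35 may be one generated without the --dns and --ip options, and the `xpack.security.transport.ssl.verification_mode` setting determines whether the check runs at all.

```python
import ipaddress


def check_ip_san(san_entries, peer_ip):
    """Raise ValueError unless peer_ip is covered by an IP SAN entry.

    san_entries is a list of (type, value) pairs, for example
    ("DNS", "elastic-kibana") or ("IP Address", "10.4.28.35"), the same
    shape Python's ssl module uses for 'subjectAltName'.
    """
    entries = list(san_entries)
    if not entries:
        # The certificate has no SAN extension at all.
        raise ValueError("No subject alternative names present")
    wanted = ipaddress.ip_address(peer_ip)
    for kind, value in entries:
        if kind == "IP Address" and ipaddress.ip_address(value) == wanted:
            return  # The certificate covers this IP.
    raise ValueError(
        f"No subject alternative names matching IP address {peer_ip} found"
    )
```

Calling check_ip_san([], "10.4.28.35") raises the first error, which matches the wording seen on elastic01, while a SAN list that includes ("IP Address", "10.4.28.35") passes.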
