Unknown error in second Elasticsearch node

I am getting the following error when I start the second node. Any ideas?

Here are my settings for the master and node 1:

Master:

bootstrap.memory_lock: false
cluster.name: panacea
http.port: 9200-9220
node.data: true
node.ingest: true
node.master: true
node.max_local_storage_nodes: 2
node.name: em
transport.tcp.port: 9300-9320
xpack.license.self_generated.type: basic
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
discovery.zen.ping.unicast.hosts: ["localhost:9200"]

Node 1:

bootstrap.memory_lock: false
cluster.name: panacea
http.port: 9230-9250
node.data: true
node.ingest: true
node.master: false
node.max_local_storage_nodes: 2
node.name: e2
transport.tcp.port: 9330-9350
xpack.license.self_generated.type: basic
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
discovery.zen.ping.unicast.hosts: ["localhost:9200"]

Error:

[2019-07-09T12:57:38,993][WARN ][o.e.t.TcpTransport       ] [e2] exception caught on transport layer [Netty4TcpChannel{localAddress=0.0.0.0/0.0.0.0:56840, remoteAddress=null}], closing connection
io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 485454502f312e30203430302042616420526571756573740d0a
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1408) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:682) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:582) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:536) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) [netty-common-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.35.Final.jar:4.1.35.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 485454502f312e30203430302042616420526571756573740d0a636f6e74656e742d747970653a206170706c69636174696f6e2f6a736f6e3b20636861727365743d5554462d380d0a636f6e74656e742d28c38027c380227d2c22737461747573223a3430307d
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1206) ~[netty-handler-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1274) ~[netty-handler-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]
        ... 16 more
[2019-07-09T12:57:39,321][INFO ][o.e.n.Node               ] [e2] stopping ...
[2019-07-09T12:57:39,321][INFO ][o.e.x.w.WatcherService   ] [e2] stopping watch service, reason [shutdown initiated]
[2019-07-09T12:57:39,493][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [e2] [controller/64920] [Main.cc@150] Ml controller exiting
[2019-07-09T12:57:39,493][INFO ][o.e.x.m.p.NativeController] [e2] Native controller process has stopped - no new native processes can be started
[2019-07-09T12:57:39,571][INFO ][o.e.n.Node               ] [e2] stopped
[2019-07-09T12:57:39,571][INFO ][o.e.n.Node               ] [e2] closing ...
[2019-07-09T12:57:39,821][INFO ][o.e.h.AbstractHttpServerTransport] [e2] publish_address {127.0.0.1:9230}, bound_addresses {127.0.0.1:9230}, {[::1]:9230}
[2019-07-09T12:57:39,821][INFO ][o.e.n.Node               ] [e2] started
[2019-07-09T12:57:39,868][INFO ][o.e.n.Node               ] [e2] closed

I checked my computer logs; there is no program running at port 56840.

When you're faced with a cluster problem, especially if this is the first time you're trying to set one up, it's smart to start simple, for instance by leaving TLS turned off (xpack.security.enabled: false) just to avoid the SSL errors, which in your case are only confusing the issue.
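
Concretely, that means flipping (or simply removing) these lines in elasticsearch.yml on both nodes while you test, and re-enabling them once the cluster forms. This is only a sketch of the relevant lines:

xpack.security.enabled: false
xpack.security.transport.ssl.enabled: false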

Your problem, or at least the first thing I noticed, is that you've configured the two nodes with different ranges of TCP ports. On your "em" node:

transport.tcp.port: 9300-9320

while on your "e2" node you have:

transport.tcp.port: 9330-9350

Because they don't have any TCP ports in common, they will never be able to communicate and thus never form a cluster.
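
For instance, if you do want to keep explicit transport ports while both nodes run on the same machine, they need to share a range; as a sketch, something like this on both nodes would do:

transport.tcp.port: 9300-9320

Each node then binds the first free port in the range ("em" gets 9300, "e2" gets 9301), so the two transport layers can actually reach each other.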

Why did you change these defaults?

If you just install Elasticsearch and leave the default TCP and HTTP ports, you'll have a much better chance of getting the cluster to form. Once that succeeds you can go on and change the default settings, such as adding TLS.
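
As a sketch, the only settings the two nodes really need for a first local test (keeping whatever security settings you settle on and leaving everything else at its default) would be roughly:

On "em":

cluster.name: panacea
node.name: em

On "e2":

cluster.name: panacea
node.name: e2
node.master: false

With no http.port or transport.tcp.port set, the nodes pick the default ports automatically (9200/9201 for HTTP and 9300/9301 for transport when both run on one machine).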

Thank you for your prompt help. I found a website that recommended these settings; I didn't know what they meant. As per your advice, I deleted the http.port and transport.tcp.port settings, but after restarting both nodes they are still not discovering each other.

And I also changed security to xpack.security.enabled: false

The second node continues to give the following error:

[2019-07-10T10:27:48,999][WARN ][o.e.c.c.ClusterFormationFailureHelper] [e2] master not discovered yet: have discovered []; discovery will continue using [127.0.0.1:9200, [::1]:9200] from hosts providers and [{e2}{ylyubB6FTpe7YYXOzqAvbA}{cBU6ioffRjOOt_mXvnVafQ}{127.0.0.1}{127.0.0.1:9301}{ml.machine_memory=17179398144, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 2, last-accepted version 18 in term 2

You are making progress, but there are still issues with your configuration. I notice that you've set:

discovery.zen.ping.unicast.hosts: ["localhost:9200"]

on both nodes. This is an important setting: it tells each node how to discover the master.

Obviously, the "e2" node can't connect to the master via localhost, so you must edit this setting to something like

discovery.zen.ping.unicast.hosts: ["<em.host>"]

where <em.host> is the DNS name of the server running your "em" master node. You don't need to add the port number, just the host name. Since there is just one master-eligible node, you shouldn't have to change the setting on "em", just on "e2".

Also note that a cluster, in order to be fault tolerant and not at risk of getting into a split-brain scenario, should have at least three master-eligible nodes. But for testing and learning how to configure a cluster you should be fine with the current setup. Good luck!

The localhost:9200 in that setting looks wrong; it should be port 9300, since discovery needs the transport port and port 9200 is normally an HTTP port.
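
So, if both nodes really are running on the same machine and "em" is on the default transport port, the corrected value would look something like this (a sketch based on the settings quoted above):

discovery.zen.ping.unicast.hosts: ["localhost:9300"]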

Also, since you're using version 7, you should be using discovery.seed_hosts instead of discovery.zen.ping.unicast.hosts. Both work, but the latter is deprecated.
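
A sketch of the version 7 equivalent on "e2", reusing the <em.host> placeholder from above (the explicit :9300 is optional; without a port, Elasticsearch falls back to the default transport port):

discovery.seed_hosts: ["<em.host>:9300"]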

The node.max_local_storage_nodes setting you have on both nodes is deprecated and not recommended. Instead, give each node its own data path.
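
For example, if both nodes run from the same installation, each one could point at its own directory (these paths are purely hypothetical):

On "em":

path.data: /var/lib/elasticsearch/em

On "e2":

path.data: /var/lib/elasticsearch/e2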

Thanks, it now works. If there are two master nodes, will their network.host settings have the same or different IPs?

Excellent!

The network.host should be the IP (or host name) of the server on which the node is running, be it a master or a data node. See Network Settings for more info.
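
For example, with two machines the values differ (the addresses here are hypothetical):

On the server running "em":

network.host: 192.168.1.10

On the server running "e2":

network.host: 192.168.1.11

If both nodes are on the same machine, as in your test setup, they can simply use the same value (for example 127.0.0.1 or the machine's own address).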
