Setting up a 3-Node Cluster on a Single Machine

Hi,
I am trying to set up a 3-node cluster on my computer, using 3 different paths.
Here is the Elasticsearch configuration.

Node 1:

cluster.name: newcluster
node.name: node-4
path.data: C:\ELK Node1\elasticsearch-7.2.1-windows-x86_64\data
path.logs: C:\ELK Node1\elasticsearch-7.2.1-windows-x86_64\logs
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["localhost:9200","localhost:9201","localhost:9202"]
cluster.initial_master_nodes: ["node-4", "node-5","node-6"]
node.data : true
node.master : true
xpack.ml.enabled: false
xpack.security.enabled : false

In the Node 2 and Node 3 configurations I have changed the node names and port numbers; everything else is the same as Node 1.
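
For reference, a sketch of the lines that would differ in the Node 2 file under that description; the exact node name, port, and paths here are assumptions based on the Node 1 file (everything else, including discovery.seed_hosts, is assumed to stay the same, which turns out to matter later in the thread):

node.name: node-5
path.data: C:\ELK Node2\elasticsearch-7.2.1-windows-x86_64\data
path.logs: C:\ELK Node2\elasticsearch-7.2.1-windows-x86_64\logs
http.port: 9201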

When I start these nodes, the cluster fails to form and I get ClusterFormationFailureHelper warnings. Below is the log data:

[2019-08-26T22:52:27,177][INFO ][o.e.h.AbstractHttpServerTransport] [node-4] publish_address {10.212.248.33:9200}, bound_addresses {[::]:9200}
[2019-08-26T22:52:27,178][INFO ][o.e.n.Node               ] [node-4] started
[2019-08-26T22:52:32,747][WARN ][o.e.c.c.ClusterFormationFailureHelper] [node-4] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [node-4, node-5, node-6] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9200, [::1]:9200, 127.0.0.1:9201, [::1]:9201, 127.0.0.1:9202, [::1]:9202] from hosts providers and [{node-4}{gAQa48mmT4-GSBjO5RC3NA}{26tS2-C4QlqiSy8ZgO7c3w}{10.212.248.33}{10.212.248.33:9300}{xpack.installed=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2019-08-26T22:52:34,837][WARN ][o.e.t.TcpTransport       ] [node-4] exception caught on transport layer [Netty4TcpChannel{localAddress=/0:0:0:0:0:0:0:1:52704, remoteAddress=localhost/0:0:0:0:0:0:0:1:9200}], closing connection
io.netty.handler.codec.DecoderException: java.io.StreamCorruptedException: invalid internal transport message format, got (48,54,54,50)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:472) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:278) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1408) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:682) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:582) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:536) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906) [netty-common-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.35.Final.jar:4.1.35.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
Caused by: java.io.StreamCorruptedException: invalid internal transport message format, got (48,54,54,50)
        at org.elasticsearch.transport.TcpTransport.readHeaderBuffer(TcpTransport.java:745) ~[elasticsearch-7.2.1.jar:7.2.1]
        at org.elasticsearch.transport.TcpTransport.readMessageLength(TcpTransport.java:731) ~[elasticsearch-7.2.1.jar:7.2.1]
        at org.elasticsearch.transport.netty4.Netty4SizeHeaderFrameDecoder.decode(Netty4SizeHeaderFrameDecoder.java:40) ~[transport-netty4-client-7.2.1.jar:7.2.1]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:502) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:441) ~[netty-codec-4.1.35.Final.jar:4.1.35.Final]

I really don't understand why these nodes are failing to communicate with each other and join the cluster. I changed the node names, changed the cluster name, deleted the previously created data folders from all 3 instances, and restarted each node several times, but nothing is working. Seeking help!

This is the first time I have seen this kind of configuration:
it is one physical machine and one physical installation.
This is how it would look if you had three hosts:

discovery.seed_hosts: ["node1:9200","node2:9200","node3:9200"]

Can you test the connection between the nodes? (telnet to each node using ports 9200 and 9300)

The issue is that you're using port 9200 (and the nearby ports 9201 and 9202) for discovery. Port numbers in the 92xx range are normally for HTTP traffic from the outside world, but node-to-node traffic like discovery uses the transport protocol on port 9300 (and 9301, 9302, and so on). Fix the port numbers in your discovery config and you should be away.
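
For illustration, a minimal sketch of the corrected setting, assuming all three nodes run on one machine and pick up the default transport ports 9300, 9301 and 9302:

# discovery (and all other node-to-node traffic) uses the transport protocol,
# which starts at port 9300 and counts upwards for each extra node on the host
discovery.seed_hosts: ["localhost:9300","localhost:9301","localhost:9302"]

# the HTTP ports 9200-9202 are only for REST clients and stay as they are
http.port: 9200    # 9201 and 9202 on the other two nodes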

@DavidTurner,

Hi David, I changed the port numbers, but I am still getting the same error.

discovery.seed_hosts: ["127.0.0.1:9300","127.0.0.1:9301","127.0.0.1:9302"]
#discovery.seed_hosts: ["localhost"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["esfirstnode","essecondnode","esthirdnode"]

node.data : true
node.master : true
#elasticsearch



[2019-08-27T11:38:34,808][INFO ][o.e.t.TransportService   ] [esfirstnode] publish_address {10.51.134.136:9300}, bound_addresses {[::]:9300}
[2019-08-27T11:38:34,821][INFO ][o.e.b.BootstrapChecks    ] [esfirstnode] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-08-27T11:38:44,852][WARN ][o.e.c.c.ClusterFormationFailureHelper] [esfirstnode] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [esfirstnode, essecondnode, esthirdnode] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302] from hosts providers and [{esfirstnode}{L1h3S_rYR9Cnn33Pso0Y-w}{WPD8sUVYSb2FZkOyo7U-2A}{10.51.134.136}{10.51.134.136:9300}{xpack.installed=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2019-08-27T11:38:54,859][WARN ][o.e.c.c.ClusterFormationFailureHelper] [esfirstnode] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [esfirstnode, essecondnode, esthirdnode] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302] from hosts providers and [{esfirstnode}{L1h3S_rYR9Cnn33Pso0Y-w}{WPD8sUVYSb2FZkOyo7U-2A}{10.51.134.136}{10.51.134.136:9300}{xpack.installed=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2019-08-27T11:39:04,872][WARN ][o.e.n.Node               ] [esfirstnode] timed out while waiting for initial discovery state - timeout: 30s
[2019-08-27T11:39:53,888][INFO ][o.e.h.AbstractHttpServerTransport] [esfirstnode] publish_address {10.51.134.136:9200}, bound_addresses {[::]:9200}
[2019-08-27T11:39:53,889][INFO ][o.e.n.Node               ] [esfirstnode] started
[2019-08-27T11:40:03,602][WARN ][o.e.c.c.ClusterFormationFailureHelper] [esfirstnode] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [esfirstnode, essecondnode, esthirdnode] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302] from hosts providers and [{esfirstnode}{L1h3S_rYR9Cnn33Pso0Y-w}{WPD8sUVYSb2FZkOyo7U-2A}{10.51.134.136}{10.51.134.136:9300}{xpack.installed=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2019-08-27T11:40:14,532][WARN ][o.e.c.c.ClusterFormationFailureHelper] [esfirstnode] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [esfirstnode, essecondnode, esthirdnode] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302] from hosts providers and [{esfirstnode}{L1h3S_rYR9Cnn33Pso0Y-w}{WPD8sUVYSb2FZkOyo7U-2A}{10.51.134.136}{10.51.134.136:9300}{xpack.installed=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2019-08-27T11:40:24,536][WARN ][o.e.c.c.ClusterFormationFailureHelper] [esfirstnode] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [esfirstnode, essecondnode, esthirdnode] to bootstrap a cluster: have discovered []; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302] from hosts providers and [{esfirstnode}{L1h3S_rYR9Cnn33Pso0Y-w}{WPD8sUVYSb2FZkOyo7U-2A}{10.51.134.136}{10.51.134.136:9300}{xpack.installed=true}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

Can you share the complete logs from every node from when they start up until when they first log master not discovered yet?

Hi David,
There is a character limit here; how can I provide all 3 log files?

Using https://pastebin.com/


@DavidTurner,
Hi David, the files are on GitHub. Please take a look at them.

There is an inconsistency in your nodes' configs:

[2019-08-28T12:54:04,193][INFO ][o.e.t.TransportService   ] [node-1] publish_address {10.51.143.92:9300}, bound_addresses {[::]:9300}
[2019-08-28T12:55:54,067][INFO ][o.e.t.TransportService   ] [node-2] publish_address {10.51.143.92:9301}, bound_addresses {[::]:9301}
[2019-08-28T13:13:35,626][INFO ][o.e.t.TransportService   ] [node-3] publish_address {127.0.0.1:9302}, bound_addresses {127.0.0.1:9302}, {[::1]:9302}

It looks like node-3 does not have network.host: 0.0.0.0 but the other two do?
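
If that's the case, a minimal sketch of the line node-3 would need so that it publishes the same kind of address as the other two (binding every node to 127.0.0.1 instead would also work, since everything runs on one machine):

# node-3: bind to all addresses like node-1 and node-2,
# instead of falling back to the default loopback-only binding
network.host: 0.0.0.0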

It also looks like there is some kind of network config problem. The nodes are listening on [::] which normally means "all addresses", but it looks like node-3 cannot connect to node-2:

[2019-08-28T13:14:06,350][INFO ][o.e.c.c.JoinHelper       ] [node-3] failed to join {node-2}{HF0Fvfi1QHigdG6OkWWJSg}{Zy6apEPYRtaRyebkrsiWpw}{10.51.143.92}{10.51.143.92:9301}{ml.machine_memory=8462733312, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={node-3}{_2_6EmT3T-2CgVCity1lQw}{K8NFLcmkRcmzjA9FtQdFog}{127.0.0.1}{127.0.0.1:9302}{ml.machine_memory=8462733312, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=26, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={node-3}{_2_6EmT3T-2CgVCity1lQw}{K8NFLcmkRcmzjA9FtQdFog}{127.0.0.1}{127.0.0.1:9302}{ml.machine_memory=8462733312, xpack.installed=true, ml.max_open_jobs=20}, targetNode={node-2}{HF0Fvfi1QHigdG6OkWWJSg}{Zy6apEPYRtaRyebkrsiWpw}{10.51.143.92}{10.51.143.92:9301}{ml.machine_memory=8462733312, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.NodeNotConnectedException: [node-2][10.51.143.92:9301] Node not connected
	at org.elasticsearch.transport.ConnectionManager.getConnection(ConnectionManager.java:151) ~[elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.transport.TransportService.getConnection(TransportService.java:568) ~[elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:540) ~[elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.cluster.coordination.JoinHelper.sendJoinRequest(JoinHelper.java:278) ~[elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.cluster.coordination.JoinHelper.sendJoinRequest(JoinHelper.java:211) ~[elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$2(JoinHelper.java:135) ~[elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:250) [x-pack-security-7.2.1.jar:7.2.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:308) [x-pack-security-7.2.1.jar:7.2.1]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:267) [elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:758) [elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.2.1.jar:7.2.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

Neither of those fully explains the problem you're seeing. Can you set logger.org.elasticsearch.discovery: TRACE on all the nodes and then start them all up again so we can see some more detail about the discovery process that's failing?
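
In case it helps, one way to switch that on is a static logger setting in each node's elasticsearch.yml; this is just a sketch (the same level can also be configured in log4j2.properties):

# extra detail about each discovery round
logger.org.elasticsearch.discovery: TRACE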

@DavidTurner,

Hi David, I posted new log files with the updated configuration. Now node-1 and node-2 are able to form a cluster, but node-3 is not joining them. Please check the log files:

https://github.com/ramya397/Elasticsearch-Cluster-Setup

OK, the discovery process looks fine now, but the joining process is timing out:

[2019-08-29T12:36:53,123][INFO ][o.e.c.c.JoinHelper       ] [node-3] last failed join attempt was 6.3s ago, failed to join {node-1}{eJcfg55eTCqlbwMXf_txeA}{Ue-1FvJLQ3qtXGE4f6ufow}{localhost}{127.0.0.1:9300}{ml.machine_memory=8462733312, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={node-3}{bBDKwl91TSi1qQINa0Uz-Q}{1UftlvqQTeiAprhofOaylg}{localhost}{127.0.0.1:9302}{ml.machine_memory=8462733312, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional.empty}
org.elasticsearch.transport.ReceiveTimeoutTransportException: [node-1][127.0.0.1:9300][internal:cluster/coordination/join] request_id [22823] timed out after [59935ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1013) ~[elasticsearch-7.2.1.jar:7.2.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688) ~[elasticsearch-7.2.1.jar:7.2.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]

That's quite surprising. Can you try again with logger.org.elasticsearch.cluster.coordination: TRACE on all the nodes? Ideally:

  • shut all the nodes down
  • set this setting
  • clear out the log files
  • start all the nodes up at about the same time
  • wait for node-3 to log master not discovered or elected yet and then capture the logs

It should only take a few minutes to do this. There's no need for hours and hours of logs.
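
As a sketch, the setting from step 2 above would go in each node's elasticsearch.yml, alongside the discovery logger from before:

# extra detail about master elections and the join process
logger.org.elasticsearch.cluster.coordination: TRACE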

Sure, give me 5 minutes and I will provide the log files.

@DavidTurner,

Hi David, the cluster has now formed successfully 🙂 but I didn't understand what the issue was before.
Could you please help me understand? I have posted the newly generated logs and the updated configuration on GitHub: https://github.com/ramya397/Elasticsearch-Cluster-Setup

Output of http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1 19 81 -1 mdi * node-2
127.0.0.1 25 81 -1 mdi - node-3
127.0.0.1 32 81 -1 mdi - node-1

Sorry, I don't really know why node-3's joining was repeatedly timing out. There's nothing in the logs to indicate why that might have been. It looks healthy now.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.