Exception when adding a second node to a cluster


(Ed Spencer) #1

Hi All,

I hope this is the right place for help...

I'm trying to set up a clustered ElasticSearch install across two separate servers.

From what I understand, my ElasticSearch configs look OK.

Node 1 is set up as the master, and the cluster name is the same in both configs.

Node 1 config:

cluster.name: ProdCluster
node.name: "Web 01"
node.master: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["PROD-WEB01:9200", "PROD-WEB02:9200"]

Node 2 config:

cluster.name: ProdCluster
node.name: "Web 02"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["PROD-WEB01:9200", "PROD-WEB02:9200"]

Both ElasticSearch instances show that they are the only node in the ProdCluster.

Here's where it gets interesting. Looking through the logs on Web01, it looks like it actually receives some communication from Web02, but then throws an exception and closes the connection. Here is the Web01 log:

[2015-07-30 16:20:20,198][INFO ][cluster.service          ] [Web 01] new_master [Web 01][iP4lY2WaQAGkmhB1yHmrpQ][FF-PROD-WEB01][inet[/10.188.71.33:9300]]{master=true}, reason: zen-disco-join (elected_as_master)
[2015-07-30 16:20:20,245][INFO ][gateway                  ] [Web 01] recovered [0] indices into cluster_state
[2015-07-30 16:20:20,385][INFO ][http                     ] [Web 01] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.188.71.33:9200]}
[2015-07-30 16:20:20,385][INFO ][node                     ] [Web 01] started
[2015-07-30 16:20:21,462][WARN ][http.netty               ] [Web 01] Caught exception while handling client http traffic, closing connection [id: 0xf2ae1630, /10.188.71.34:51611 => /10.188.71.33:9200]
java.lang.IllegalArgumentException: invalid version format: 10.188.71.34
	at org.elasticsearch.common.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:94)
	at org.elasticsearch.common.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:62)
	at org.elasticsearch.common.netty.handler.codec.http.HttpRequestDecoder.createMessage(HttpRequestDecoder.java:75)
	at org.elasticsearch.common.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:191)
	at org.elasticsearch.common.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:102)
	at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
	at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
	at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
	at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
	at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
	at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
	at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)

(10.188.71.34 is the private ip address of Web02)

And the log for Web02, which seems to indicate that it becomes the master after not getting an expected response from Web01:

[2015-07-30 16:20:25,540][INFO ][cluster.service          ] [Web 02] new_master [Web 02][egm3PZBBSoaVJIexy_Obmw][FF-PROD-WEB02][inet[/10.188.71.34:9300]], reason: zen-disco-join (elected_as_master)

Setup on both servers:
Windows Server x64
ElasticSearch x64: 1.6.0
Java x64: jre1.8.0_51

Has anyone got any ideas or tips? Anything would be much appreciated. Many thanks.


(Christian Dahlqvist) #2

Port 9200 is the default port for HTTP traffic. Node-to-node communication, including discovery, goes over the transport port, which defaults to 9300. You should use 9300 instead of 9200 in the unicast hosts list so that the nodes can connect and discover each other.
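
As a rough sketch of that change, assuming both nodes still use the default transport port 9300 and that the host names from the original post resolve on both servers, the discovery lines in each node's elasticsearch.yml would look something like this:

```yaml
# Same on both nodes.
# Assumption: the transport port is left at its default of 9300.
# 9200 is the HTTP port and should not appear in the unicast hosts list.
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["PROD-WEB01:9300", "PROD-WEB02:9300"]
```

Port 9300 also has to be reachable between the two machines, so any firewall between them needs to allow it.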


(Ed Spencer) #3

Christian, you are a gentleman.

Thank you very much. We also had to open that port between the two boxes.

Working perfectly now! Thanks so much.

