New 2.1.1 cluster forms with errors [SOLVED]


(Kawika Ohumukini) #1

Node 1: Master/Data
Node 2: Master/Data
Node 3: Master

Nodes 1 and 2 form a cluster and hold all the imported data; queries and Marvel look great.
After bringing up Node 3, _cluster/health on Nodes 1 and 2 shows 3 nodes and their logs look good. Marvel also says 3 nodes but doesn't show Node 3 in the node list.

The problem is that Node 3's logs do not show cluster messages such as "master found", _cluster/health times out on it, and the node is generally unusable. It looks a bit like split-brain, except that only Node 3 thinks it is not in a cluster.

Thanks for any ideas to try.


(Mark Walkom) #2

What do your configs look like?


(Kawika Ohumukini) #3

Here's what they look like. The comments show the differences between the three machines. Thanks

node.name: "node1" # node2 and node3
network.host: 192.168.20.11 # .12 and .13
action.destructive_requires_name: true
bootstrap.mlockall: true
cluster.name: analytics
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.timeout: 15s
discovery.zen.ping.unicast.hosts: [ "192.168.20.11", "192.168.20.12", "192.168.20.13"]
gateway.expected_nodes: 2
gateway.recover_after_nodes: 1
gateway.recover_after_time: 5m
http.cors.enabled: true
index.cache.field.expire: 5m
index.number_of_replicas: 1
index.number_of_shards: 5
indices.fielddata.cache.size: 3GB
node.data: true # false for node3
node.master: true
path.data: "/mnt/ssd/elasticsearch"
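
As a sanity check on discovery.zen.minimum_master_nodes: 2, the usual quorum rule for Zen discovery is (master-eligible nodes / 2) + 1. The sketch below applies that rule to the three nodes described in this thread. The function name and the `nodes` dictionary are illustrative only, not part of any Elasticsearch API.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Roles as described in the first post (node3 is master-only).
nodes = {
    "node1": {"master": True, "data": True},
    "node2": {"master": True, "data": True},
    "node3": {"master": True, "data": False},
}

eligible = sum(1 for roles in nodes.values() if roles["master"])
print(minimum_master_nodes(eligible))  # 2
```

With three master-eligible nodes the rule gives 2, which matches the value in the config, so the quorum setting itself does not look like the cause.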


(Kawika Ohumukini) #4

I turned on debug-level logging on node3 (192.168.20.13). The following set of messages just repeats.

[2015-12-26 20:11:53,687][DEBUG][transport.netty ] [node3] connected to node [{#zen_unicast_4#}{192.168.20.11}{192.168.20.11:9300}]
[2015-12-26 20:11:53,687][DEBUG][transport.netty ] [node3] connected to node [{#zen_unicast_2#}{192.168.20.12}{192.168.20.12:9300}]

...15 second pause...

[2015-12-26 20:22:08,702][DEBUG][transport.netty ] [node3] disconnecting from [{#zen_unicast_4#}{192.168.20.11}{192.168.20.11:9300}] due to explicit disconnect call
[2015-12-26 20:22:08,702][DEBUG][discovery.zen ] [node3] filtered ping responses: (filter_client[true], filter_data[false])
--> ping_response{node [{node1}{jrPBHHpuQgqrYlJvA5Qhcg}{192.168.20.11}{192.168.20.11:9300}{master=true}], id[365], master [{node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true}], hasJoinedOnce [true], cluster_name[analytics]}
--> ping_response{node [{node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true}], id[364], master [{node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true}], hasJoinedOnce [true], cluster_name[analytics]}
[2015-12-26 20:22:08,702][DEBUG][transport.netty ] [node3] disconnecting from [{#zen_unicast_2#}{192.168.20.12}{192.168.20.12:9300}] due to explicit disconnect call
[2015-12-26 20:22:08,707][DEBUG][discovery.zen.publish ] [node3] received diff for but don't have any local cluster state - requesting full state
[2015-12-26 20:22:08,812][DEBUG][cluster.service ] [node3] processing [finalize_join ({node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true})]: execute
[2015-12-26 20:22:08,812][DEBUG][discovery.zen ] [node3] no master node is set, despite of join request completing. retrying pings.
[2015-12-26 20:22:08,812][DEBUG][cluster.service ] [node3] processing [finalize_join ({node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true})]: took 0s no change in cluster_state
[2015-12-26 20:22:08,817][DEBUG][transport.netty ] [node3] connected to node [{#zen_unicast_2#}{192.168.20.12}{192.168.20.12:9300}]
[2015-12-26 20:22:08,818][DEBUG][transport.netty ] [node3] connected to node [{#zen_unicast_4#}{192.168.20.11}{192.168.20.11:9300}]
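
To make the ping responses above easier to read, here is a minimal sketch that pulls the responding node, ping id, reported master, and cluster name out of each ping_response entry. The regular expression is written against the exact line shape shown in this log and is an assumption about the format, not a documented one; the function names are hypothetical.

```python
import re

# Matches one ping_response entry as it appears in the debug log above.
PING_RE = re.compile(
    r"ping_response\{node \[\{([^}]*)\}.*?\], id\[(\d+)\], "
    r"master \[\{([^}]*)\}.*?cluster_name\[([^\]]*)\]"
)


def parse_ping_responses(log_text):
    """Return one dict per ping_response entry found in log_text."""
    return [
        {
            "node": m.group(1),
            "ping_id": int(m.group(2)),
            "master": m.group(3),
            "cluster": m.group(4),
        }
        for m in PING_RE.finditer(log_text)
    ]


def agreed_master(responses):
    """Return the master name if every response reports the same one, else None."""
    masters = {r["master"] for r in responses}
    return masters.pop() if len(masters) == 1 else None


# The two ping_response lines copied from the log in this post.
SAMPLE = """
--> ping_response{node [{node1}{jrPBHHpuQgqrYlJvA5Qhcg}{192.168.20.11}{192.168.20.11:9300}{master=true}], id[365], master [{node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true}], hasJoinedOnce [true], cluster_name[analytics]}
--> ping_response{node [{node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true}], id[364], master [{node2}{5_5TIHtiRC-FvuqZzzYjTw}{192.168.20.12}{192.168.20.12:9300}{master=true}], hasJoinedOnce [true], cluster_name[analytics]}
"""

responses = parse_ping_responses(SAMPLE)
print([r["node"] for r in responses])  # ['node1', 'node2']
print(agreed_master(responses))        # node2
```

Both responders name node2 as master and report the same cluster name, so discovery itself is reaching the existing cluster; the failure happens later, around the finalize_join step.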


(Kawika Ohumukini) #5

I hadn't installed the license on the third node. Installed it and all is good.
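
For anyone hitting the same thing, a rough sketch of a plugin-parity check follows. It assumes "the license" here means the license plugin that Marvel depends on in 2.x, which the post does not state explicitly. The `inventory` data is hypothetical and hand-written for illustration; in practice it could be collected per node, for example from the _cat/plugins output.

```python
def missing_plugins(plugins_by_node):
    """Map each node to the plugins that other nodes have but it lacks."""
    all_plugins = set().union(*plugins_by_node.values())
    gaps = {}
    for node, installed in plugins_by_node.items():
        missing = all_plugins - set(installed)
        if missing:
            gaps[node] = sorted(missing)
    return gaps


# Hypothetical inventory, NOT taken from the real cluster in this thread.
inventory = {
    "node1": {"license", "marvel-agent"},
    "node2": {"license", "marvel-agent"},
    "node3": {"marvel-agent"},
}

print(missing_plugins(inventory))  # {'node3': ['license']}
```

An empty result means every node has the same set of plugins installed.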

