ES upgrade 1.6.2 to 2.1.0 doesn't bind to desired address

(Dylan Humphreys) #1

Hi Everyone,
we're currently running Elasticsearch 1.7.3 and want to upgrade to 2.1.1.
The problem I'm having is that 2.x seems to have changed the default bind behaviour. As a result, Elasticsearch binds to ONLY one address, and not to all of the addresses on a host.

This is our current config (or lack thereof):
# grep "network." /etc/elasticsearch/elasticsearch.yml

In 1.7.3 this gives us the default and desired behaviour:

# netstat -tulpn | grep java
tcp6       0      0 :::9300                 :::*                    LISTEN      13275/java
tcp6       0      0 :::9200                 :::*                    LISTEN      13275/java

We have this network setup on each node, with the nodes using .61, .62 and .63 respectively:

 ~ # ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    inet brd scope host lo
       valid_lft forever preferred_lft forever
    inet scope global lo
       valid_lft forever preferred_lft forever
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    inet brd scope global bond0
       valid_lft forever preferred_lft forever

.60 is the service IP (we use keepalived to make sure requests from Kibana go to whichever node is still up, assuming a still-functioning cluster). Keepalived is on a completely separate host.

Once we upgrade to 2.1.1 using this guide, it ONLY binds to

As a result, the nodes don't communicate with each other, and we get this:

curl -s
{
  "error" : {
    "root_cause" : [ {
      "type" : "master_not_discovered_exception",
      "reason" : "waited for [30s]"
    } ],
    "type" : "master_not_discovered_exception",
    "reason" : "waited for [30s]"
  },
  "status" : 503
}

on all nodes.

What I have tried:

network.bind_host = [ "",  "", "" ]
network.publish_host = ""

Where x is the correct byte for the node in question.

I also tried combinations of the above, but we always get the same result (master_not_discovered).

All of the nodes are on the same subnet, and there are no firewalls in the way. I even verified that the nodes can connect to each other on the relevant ports using telnet.
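
For anyone wanting to repeat that reachability check without telnet, here is a minimal sketch in Python. The helper name `port_open` and the address in `nodes` are placeholders of mine, not part of the original setup; 9200 and 9300 are the HTTP and transport ports from the netstat output above. A successful connect only shows that something is listening on that address and port. It says nothing about which address the node publishes to the rest of the cluster.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical placeholder: replace with the real node addresses.
    nodes = ["127.0.0.1"]
    for host in nodes:
        for port in (9200, 9300):
            state = "open" if port_open(host, port) else "closed"
            print(f"{host}:{port} {state}")
```

Run it from each node against the other nodes' addresses; any port reported as closed is one that node is not listening on for that address.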

The logs show this:

[2016-01-22 11:20:46,003][DEBUG][discovery.zen            ] [node3] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2016-01-22 11:20:53,603][DEBUG][discovery.zen            ] [node3] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2016-01-22 11:21:01,123][DEBUG][discovery.zen            ] [node3] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2016-01-22 11:21:08,448][WARN ][discovery                ] [node3] waited for 30s and no initial state was set by the discovery
[2016-01-22 11:21:08,453][DEBUG][cluster.service          ] [node3] processing [gateway_initial_state_recovery]: execute
[2016-01-22 11:21:08,453][DEBUG][cluster.service          ] [node3] processing [gateway_initial_state_recovery]: took 0s no change in cluster_state
[2016-01-22 11:21:08,470][DEBUG][http.netty               ] [node3] Bound http to address {}
[2016-01-22 11:21:08,472][DEBUG][http.netty               ] [node3] Bound http to address {}
[2016-01-22 11:22:56,566][WARN ][] [node3] failed to send ping to [{node3}{FhHZuCAkQEOikQO48eIWhg}{}{}{master=true}]

Which seems to imply it can't talk to itself...

Ideally, I'd like to recreate the current 1.7.3 behaviour, but listening on is not required. Listening on the node's eth0 IP AND the service IP (on loopback) is a must, however.

Any pointers greatly appreciated.
Thanks in advance.


(Magnus Bäck) #2

Setting to should make it listen on all interfaces.
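
As a rough sketch of what that could look like in /etc/elasticsearch/elasticsearch.yml on 2.x: the setting name and wildcard value below are assumptions, since they are not spelled out above. In 2.x, `network.host` sets both the bind and publish host, and `network.publish_host` can override the address advertised to other nodes. The `_bond0_` value assumes the bond0 interface from the `ip -4 a` output in the first post.

```yaml
# /etc/elasticsearch/elasticsearch.yml
# Assumption: wildcard bind, so the node listens on every interface as 1.7.3 did.
network.host: 0.0.0.0

# Optional, also an assumption: pin the address advertised to other nodes
# so they do not try to reach this node on the loopback or service IP.
# _bond0_ resolves to an address of the bond0 interface.
# network.publish_host: _bond0_
```

After a restart, `netstat -tulpn | grep java` should show whether 9200 and 9300 are listening on the wildcard address again.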
