Tribe node: connecting to specific IPs

I am trying to use a tribe node to connect to two separate clusters, both running ES v2.2. Both clusters have the default configuration. When I try to start the tribe node using the elasticsearch.yml below, it doesn't start, and it doesn't give any errors as to why.

cluster.name: tribe
node.name: tribenode
tribe:
t1:
cluster.name: tribe1
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["IP.of.tribe1"]
t2:
cluster.name: tribe2
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["IP.of.tribe2"]
network.host: 0.0.0.0

I have tried commenting out the discovery portions, and then it starts. However, the tribe node doesn't connect to the two clusters as it should. The link here doesn't show anything about how to connect to specific IPs, and since I installed ES from the repo, I can't seem to use that to get it working.

I've tried creating a .kibana index manually on the tribe node, because I saw something saying I needed to do that, but then it gave a master_not_discovered_exception.

Am I doing something wrong with the configuration? Or are there examples of how to get it to work?

Thanks for the help!

FYI you were looking at the 1.4 documentation there, not 2.2. https://www.elastic.co/guide/en/elasticsearch/reference/2.2/modules-tribe.html is the right page.

However, YAML is whitespace sensitive, so if you don't have the indentation it won't work. Maybe that's just a formatting issue with your config paste, so please use the </> button to code-format it.

Sorry, here is the formatted yml:

cluster.name: tribe
node.name: tribenode
tribe:
    t1:
        cluster.name: tribe1
        discovery.zen.ping.multicast.enabled: false
        discovery.zen.ping.unicast.hosts: ["IP.of.tribe1"]
    t2:
        cluster.name: tribe2
        discovery.zen.ping.multicast.enabled: false
        discovery.zen.ping.unicast.hosts: ["IP.of.tribe2"]

Let's verify a few things. What do you get when you open the following from a browser?

  • http://[tribe 1 master or client node]:9200
  • http://[tribe 2 master or client node]:9200

What are the IP addresses that you use in both URLs? And what is the IP address of the machine you run the browser from?

By default, ES binds to localhost on every node, which would be the case for both clusters, tribe1 and tribe2 (since you said both are using the default settings). If so, you won't be able to connect the tribe node to these clusters.
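
For example, each node in tribe1 and tribe2 would need something like a network.host entry in its elasticsearch.yml pointing at an address the tribe node can reach; the value below is only a placeholder:

    # bind to a routable interface instead of the default loopback
    network.host: 192.168.1.10    # or 0.0.0.0 to listen on all interfaces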

I'm having the exact same problem, with nearly identical configurations. To answer your questions:

  1. I get the default cluster information when I put those addresses into a browser, e.g. http://[my tribe 1 IP]:9200 returns:
    {
      "name" : "War V",
      "cluster_name" : "Demo",
      "version" : {
        "number" : "2.3.2",
        "build_hash" : "b9e4a6acad4008027e4038f6abed7f7dba346f94",
        "build_timestamp" : "2016-04-21T16:03:47Z",
        "build_snapshot" : false,
        "lucene_version" : "5.5.0"
      },
      "tagline" : "You Know, for Search"
    }

  2. On both the tribe 1 and tribe 2 masters, I have bound ES to the machine's own IP address, not localhost. For example, if the IP address of the tribe 1 master were 100.200.300.300, the elasticsearch.yml on the tribe 1 master would look like:
    network.host: 100.200.300.300

From what I can see, the main error is a failed ping when the tribe node tries to connect to the tribe 1 and tribe 2 clusters, but I don't understand why this happens. Any help would be greatly appreciated.

To make your life easier, let's try this:

  • set network.host on all nodes in all clusters, including the tribe node, to 0.0.0.0 (this tells ES to listen on all network interfaces available on each node)

  • comment out network.bind_host and network.publish_host

  • in the configuration for the tribe node, add the following parameters to each tribe's setup (a combined sketch follows this list)

  • tribe.<name>.network.bind_host: 0.0.0.0

  • tribe.<name>.network.publish_host: <tribe node's IP address>
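
Putting those pieces together, here is a minimal sketch of what the tribe node's elasticsearch.yml could look like under these assumptions (192.168.1.50 stands in for the tribe node's own IP, and "IP.of.tribe1"/"IP.of.tribe2" are the addresses of the remote clusters' nodes):

    cluster.name: tribe
    node.name: tribenode
    network.host: 0.0.0.0

    tribe.t1.cluster.name: tribe1
    tribe.t1.discovery.zen.ping.unicast.hosts: ["IP.of.tribe1"]
    tribe.t1.network.bind_host: 0.0.0.0
    tribe.t1.network.publish_host: 192.168.1.50    # the tribe node's own IP

    tribe.t2.cluster.name: tribe2
    tribe.t2.discovery.zen.ping.unicast.hosts: ["IP.of.tribe2"]
    tribe.t2.network.bind_host: 0.0.0.0
    tribe.t2.network.publish_host: 192.168.1.50    # the tribe node's own IP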

Instead of trying to connect to two clusters, focus on one at a time. Once it's working, you'll know what to do with the other.

In the past (v1.7.x or below) you didn't need to add these parameters, but in v2.1.1 (the version that I tested) I had to, even though I was told I shouldn't need to. I've not tested v2.3.x, but since that's what you're running, give it a try. If it works, great; if not, we can try other things.

Thanks for replying, I really appreciate the help. So I tried those things and I'm not sure why this isn't working, but I have a feeling it's something small that I'm missing. This is what my master node configuration looks like for the cluster I am trying to connect to (named "Demo"):

cluster.name: Demo
network.host: 0.0.0.0

Those are the only changes I made to the master node; everything else is default.

This is the configuration for the tribe node:
cluster.name: Tribetest

tribe.t1.cluster.name: Demo
tribe.t1.discovery.zen.ping.multicast.enabled: false
tribe.discovery.zen.ping.unicast.hosts: ["IP address of "Demo" Node"]
tribe.t1.network.bind_host: 0.0.0.0
tribe.t1.network.publish_host: ["The IP Address of THIS tribe Node"]

network.host: 0.0.0.0

Everything else is default.

The "Demo" Node starts up fine, but this is the error I get when I start up the "Tribetest" node:

[2016-05-06 10:59:36,076][WARN ][discovery.zen.ping.unicast] [Carnage/discovery] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
RemoteTransportException[[Carnage][[::1]:9300][internal:discovery/zen/unicast]]; nested: ActionNotFoundTransportException[No handler for action [internal:discovery/zen/unicast]];

Thanks again for your help. Just really want to get this working.

"Carnage" is the name of the "Tribetest" cluster node. I didn't set a specific node name.

Ok, just a little update. I realized I goofed: "tribe.discovery.zen.ping.unicast.hosts" should have been "tribe.t1.discovery.zen.ping.unicast.hosts". Anyway, this is the error that comes up now.

failed to send join request to master [{Wysper}{e1jiIMoESuSU6X9clSKD8w}{192.168.150.107}{192.168.150.107:9300}], reason [RemoteTransportException[[Wysper][192.168.150.107:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[Spectral/t1][172.16.5.1:9301] connect_timeout[30s]]; nested: NotSerializableExceptionWrapper[connect_timeout_exception: connection timed out: /172.16.5.1:9301]; ]

"Wysper" is the name of the node on the "Demo" cluster which is a good sign because it at least shows it is trying to connect.

Even though I suggested configuring network.host to be 0.0.0.0, for this parameter use the machine IP address of the tribe node. It should be the IP address that t1 can talk to.
If you are not sure, ping the tribe node from t1's master node.
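
For example, if (as I read the advice above) the parameter in question is tribe.t1.network.publish_host, and the tribe node's address reachable from the Demo cluster were 192.168.150.50 (a made-up placeholder, not taken from your setup), the line would become:

    tribe.t1.network.publish_host: 192.168.150.50    # an IP that the t1/Demo cluster can reach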

I just verified my current setup using ES v2.1.1, it's still working fine as described.

Awesome. It's working now, thanks so much for your help. I have also found that an error occurs when I try to connect a tribe node to Elasticsearch nodes that have been installed through a repository. I'm not sure what the issue is there, but maybe you've heard something about that?

Glad to hear that it's working on your side now... I'm not sure what you meant by "... installed through a repository"

Sorry, I mean I installed it this way: https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-repositories.html

That would be similar to an .rpm installation, as if you had downloaded the .rpm file and installed it manually.

After the install, you'll still need to adjust the configuration to what you want. Without doing so, I expect things would not work properly.

I don't use yum or apt-get installs and have not heard about any issues at all.

Ok. It seems that no one I have come across has used tribe nodes with an .rpm install. I'll keep researching and trying to figure it out. Thanks again for your help.

If the repo is up to date (I think it is), you are using the ES v2.3.x RPM. The directory layout is similar, but if you are using systemd, launching the service is a little bit different.

https://www.elastic.co/guide/en/elasticsearch/reference/2.3/setup-dir-layout.html

As far as the tribe node is concerned, you still need to adjust the configuration file.

Otherwise there is no difference to the config settings.
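
In other words, with a repo (yum/apt) install the same tribe settings just go into the package's config file, which in the default layout described in the page above is /etc/elasticsearch/elasticsearch.yml. A rough sketch, reusing the placeholder names from earlier in this thread:

    # /etc/elasticsearch/elasticsearch.yml on the tribe node
    tribe.t1.cluster.name: Demo
    tribe.t1.discovery.zen.ping.unicast.hosts: ["IP.of.the.Demo.master"]
    tribe.t1.network.publish_host: <tribe node IP>
    # on systemd systems the service is typically started with: sudo systemctl start elasticsearch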

Thanks @warkolm

My elasticsearch.yml for the tribe node is as below:

network.host: 0.0.0.0
transport.tcp.port: 9300
http.port: 9200
http.enabled: true

tribe.t1.cluster.name: pr_site
tribe.t1.discovery.zen.ping.unicast.hosts: ["172.16.173.12:9300"]
tribe.t1.discovery.zen.ping.multicast.enabled: false
tribe.t1.path.conf: /home/user1/Desktop/UPI/elasticsearch-2.1.1/config
tribe.t1.path.plugins: /home/user1/Desktop/UPI/elasticsearch-2.1.1/plugins
tribe.t1.network.bind_host: 0.0.0.0
tribe.t1.network.publish_host: ["172.16.153.14"]

tribe.t2.cluster.name: ha_site
tribe.t2.discovery.zen.ping.unicast.hosts: ["172.16.173.77:9300"]
tribe.t2.discovery.zen.ping.multicast.enabled: false
tribe.t2.path.conf: /home/user1/Desktop/UPI/elasticsearch-2.1.1/config
tribe.t2.path.plugins: /home/user1/Desktop/UPI/elasticsearch-2.1.1/plugins
tribe.t2.network.bind_host: 0.0.0.0
tribe.t2.network.publish_host: ["172.16.153.14"]

But an exception is printed in the ES log, as below:
Exception in thread "main" BindTransportException[Failed to bind to [9300-9400]]; nested: ChannelException[Failed to bind to: /0.0.0.0:9400]; nested: AccessControlException[access denied ("java.net.SocketPermission" "localhost:9400" "listen,resolve")];
Likely root cause: java.security.AccessControlException: access denied ("java.net.SocketPermission" "localhost:9400" "listen,resolve")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
at java.security.AccessController.checkPermission(AccessController.java:884)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at java.lang.SecurityManager.checkListen(SecurityManager.java:1131)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:221)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)

Any idea what the issue could be? Where is it getting port 9400 from?

I'm assuming the log above is from the tribe node...

run "netstat -lntup | grep 9" to see what is attached to IP:port in this tribe node

Also, do the following to make sure ES is using the configuration file that you think it is using:

run "ps -ef | grep elastic"