I am trying to use a tribe node to connect to two separate clusters, both running ES v2.2. Both clusters use the default configuration. When I try to start the ES tribe node using the elasticsearch.yml below, it doesn't start, and it doesn't log any error explaining why.
I have tried commenting out the discovery portions, and then it starts. However, the tribe node doesn't connect to the two clusters as it should. The link here doesn't show anything about how to connect to specific IPs, and since I installed ES from the repo, I can't seem to use this to get it working.
I've tried creating a .kibana index manually on the tribe, because I saw something saying I needed to do that, but then it gave a master_not_discovered_exception.
Am I doing something wrong with the configuration? Or are there examples of how to get it to work?
However, YAML is indentation-sensitive, so it won't work without correct indentation. Maybe that's just a formatting issue with your config paste, so please use the </> button to format it as code.
Let's verify a few things. What do you get when you open these URLs in a browser?
http://[tribe 1 master or client node]:9200
http://[tribe 2 master or client node]:9200
What are the IP addresses that you use in both URLs? What is the IP address of the machine where you run the browser?
By default, ES binds to localhost on each cluster, tribe1 and tribe2 (you said both are using the default settings). If that is the case, you won't be able to connect the tribe node to these clusters.
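As a minimal sketch of what a non-loopback binding could look like in a cluster node's elasticsearch.yml (192.0.2.21 is a placeholder address, not one from this thread):

```yaml
# elasticsearch.yml on a node of a cluster the tribe node should join
# 192.0.2.21 is a placeholder; substitute an address the tribe node can reach
network.host: 192.0.2.21
```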
I'm having the exact same problem, with nearly identical configurations. To answer your questions:
I get the default cluster information when I open those addresses in a browser.
For example (http://[my tribe 1 IP]:9200):
{
  "name" : "War V",
  "cluster_name" : "Demo",
  "version" : {
    "number" : "2.3.2",
    "build_hash" : "b9e4a6acad4008027e4038f6abed7f7dba346f94",
    "build_timestamp" : "2016-04-21T16:03:47Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}
On both the tribe 1 and tribe 2 masters, I have bound ES to the machine's own IP address, not localhost. For example, if the IP address of the tribe 1 master were 100.200.300.300, the elasticsearch.yml on the tribe 1 master would look like:
network.host: 100.200.300.300
From what I can see, the main error is a failed ping when the tribe node tries to connect to the tribe 1 and tribe 2 clusters, but I don't understand why. Any help would be greatly appreciated.
Set network.host on all nodes in all clusters, including the tribe node, to 0.0.0.0 (this tells ES to listen on all network interfaces available on each node).
Comment out network.bind_host and network.publish_host.
In the configuration for the tribe node, add the following parameters to each tribe's setup:
tribe.<name>.network.bind_host: 0.0.0.0
tribe.<name>.network.publish_host: <tribe node's IP address>
Instead of trying to connect to two clusters, focus on one at a time. Once it's working, you'll know what to do with the other.
In v1.7.x and earlier, you didn't need to add these parameters, but in v2.1.1 (the version I tested) I had to, even though I was told it wasn't necessary. I've not tested v2.3.x, but since that is what you are running, give it a try. If it works, great. If not, we can try other things.
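For reference, the steps above might look roughly like this in the tribe node's elasticsearch.yml. This is only a sketch: `t1` is an assumed tribe name and 192.0.2.10 is a placeholder for the tribe node's address, so adjust both to your setup.

```yaml
# elasticsearch.yml on the tribe node (sketch; one cluster only)
network.host: 0.0.0.0            # listen on all interfaces
# network.bind_host: ...         # left commented out, per the steps above
# network.publish_host: ...

# "t1" is a hypothetical tribe name; 192.0.2.10 is a placeholder address
tribe.t1.network.bind_host: 0.0.0.0
tribe.t1.network.publish_host: 192.0.2.10
```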
Thanks for replying, I really appreciate the help. I tried those things and I'm not sure why this isn't working, but I have a feeling it's something small that I'm missing. This is what my master node configuration looks like for the cluster I am trying to connect to (titled "Demo"):
Those are the only changes I made to the master node, everything else is default.
This is the configuration for the tribe node:
cluster.name: Tribetest
tribe.t1.cluster.name: Demo
tribe.t1.discovery.zen.ping.multicast.enabled: false
tribe.discovery.zen.ping.unicast.hosts: ["IP address of "Demo" Node"]
tribe.t1.network.bind_host: 0.0.0.0
tribe.t1.network.publish_host: ["The IP Address of THIS tribe Node"]
network.host: 0.0.0.0
Everything else is default.
The "Demo" Node starts up fine, but this is the error I get when I start up the "Tribetest" node:
[2016-05-06 10:59:36,076][WARN ][discovery.zen.ping.unicast] [Carnage/discovery] failed to send ping to [{#zen_unicast_6#}{::1}{[::1]:9300}]
RemoteTransportException[[Carnage][[::1]:9300][internal:discovery/zen/unicast]]; nested: ActionNotFoundTransportException[No handler for action [internal:discovery/zen/unicast]];
Thanks again for your help. Just really want to get this working.
Ok, just a little update. I realized I goofed: "tribe.discovery.zen.ping.unicast.hosts" should have been "tribe.t1.discovery.zen..etc." Anyway, this is the error that comes up now:
failed to send join request to master [{Wysper}{e1jiIMoESuSU6X9clSKD8w}{192.168.150.107}{192.168.150.107:9300}], reason [RemoteTransportException[[Wysper][192.168.150.107:9300][internal:discovery/zen/join]]; nested: ConnectTransportException[[Spectral/t1][172.16.5.1:9301] connect_timeout[30s]]; nested: NotSerializableExceptionWrapper[connect_timeout_exception: connection timed out: /172.16.5.1:9301]; ]
"Wysper" is the name of the node on the "Demo" cluster which is a good sign because it at least shows it is trying to connect.
Even though I suggested setting network.host to 0.0.0.0, for this parameter use the IP address of the tribe node's machine. It should be an IP address that t1 can reach.
If you are not sure, from t1's master node, ping the tribe node.
I just verified my current setup using ES v2.1.1; it's still working fine as described.
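Putting the fixes from this thread together, a tribe node configuration along these lines may be a reasonable starting point. It is a sketch, not a verified config: it assumes that "this parameter" above means tribe.t1.network.publish_host (which would fit the connect_timeout to 172.16.5.1:9301 in the log), and 192.0.2.10 is a placeholder for an address of the tribe node that the Demo master can reach.

```yaml
# elasticsearch.yml on the tribe node (sketch)
cluster.name: Tribetest
network.host: 0.0.0.0

tribe.t1.cluster.name: Demo
tribe.t1.discovery.zen.ping.multicast.enabled: false
# Note the "t1" in the key; the address is taken from the join-request log above
tribe.t1.discovery.zen.ping.unicast.hosts: ["192.168.150.107"]
tribe.t1.network.bind_host: 0.0.0.0
# Placeholder: must be an address the Demo master can connect back to
tribe.t1.network.publish_host: 192.0.2.10
```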
Awesome. It's working now, thanks so much for your help. I have also found that an error occurs when I try to connect a tribe node to Elasticsearch nodes that were installed from a repository. I'm not sure what the issue is there, but maybe you've heard something about that?
Ok. It seems that no one I have come across has used tribe nodes when they've used .rpm. I'll keep researching and trying to figure that out. Thanks again for your help.
If the repo is up to date (I think it is) you are using ES v2.3.x RPM. The directory layout is similar but if you are using systemd, it's a little bit different to launch the service.
But an exception is printed in the ES log, as below:
Exception in thread "main" BindTransportException[Failed to bind to [9300-9400]]; nested: ChannelException[Failed to bind to: /0.0.0.0:9400]; nested: AccessControlException[access denied ("java.net.SocketPermission" "localhost:9400" "listen,resolve")];
Likely root cause: java.security.AccessControlException: access denied ("java.net.SocketPermission" "localhost:9400" "listen,resolve")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
at java.security.AccessController.checkPermission(AccessController.java:884)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at java.lang.SecurityManager.checkListen(SecurityManager.java:1131)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:221)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
Any idea what the issue could be? How is it getting port 9400?