I've been having an issue where my transport client is able to communicate
fine with Elasticsearch on EC2; however, as soon as I rebuild the Elasticsearch
box and its IP changes, the TransportClient throws a NoNodeAvailableException.
It seems to keep the old IP and not re-resolve the host to the latest IP. I am
using a hostname, not an actual IP, when adding the transport address. Am I not
setting something up right, or is this a bug in the transport client?
Here is how I am setting up the transport client:
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

Settings settings = ImmutableSettings.settingsBuilder()
        .put("client.transport.sniff", false)
        .put("client.transport.ignore_cluster_name", true)
        .build();
transportClient = new TransportClient(settings);
String[] servers = clusterConfiguration.getServers();
if (servers != null) {
    for (String server : servers) {
        transportClient.addTransportAddress(
                new InetSocketTransportAddress(server, clusterConfiguration.getPort()));
    }
}
This is the exception I am getting when the Elasticsearch IP changes. Just to
make sure the box was resolving the hostname, I also pinged the host from the
same box and confirmed it picked up the latest IP.
21:17:36.007 [elasticsearch[Time Bomb][generic][T#1]] DEBUG org.elasticsearch.client.transport - [Time Bomb] failed to connect to node [[#transport#-1][inet[elasticsearch-t1/10.60.95.159:9300]]], removed from nodes list
org.elasticsearch.transport.ConnectTransportException: [][inet[elasticsearch-t1/10.60.95.159:9300]] connect_timeout[30s]
        at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:665) ~[elasticsearch-0.20.5.jar:na]
        at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:604) ~[elasticsearch-0.20.5.jar:na]
        at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:574) ~[elasticsearch-0.20.5.jar:na]
        at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:127) ~[elasticsearch-0.20.5.jar:na]
        at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.sample(TransportClientNodesService.java:302) ~[elasticsearch-0.20.5.jar:na]
        at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:281) [elasticsearch-0.20.5.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_15]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_15]
        at java.lang.Thread.run(Thread.java:722) [na:1.7.0_15]
Being removed from the node list may be due to the node not addressing the
correct cluster name. I saw you use "client.transport.ignore_cluster_name" =
true. Note that this disables cluster name validation, so you can start a
transport client and try to detect different clusters, but it does not disable
client acceptance on the cluster side.
Jörg
On 02.04.2013 02:30, ElasticRook wrote:
[Time Bomb] failed to connect to node
[[#transport#-1][inet[elasticsearch-t1/10.60.95.159:9300]]], removed
from nodes list
org.elasticsearch.transport.ConnectTransportException:
[inet[elasticsearch-t1/10.60.95.159:9300]] connect_timeout[30s]
What you observe here is due to your setting "client.transport.sniff" =
false. If you disable sniffing, the TransportClient will not try to look
for other nodes of the cluster.
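For reference, a minimal sketch of the same client setup with sniffing enabled
(host name and port taken from your example; this is an illustration, not your
exact code):

TransportClient client = new TransportClient(ImmutableSettings.settingsBuilder()
        .put("client.transport.sniff", true) // ask the connected node for the other cluster nodes
        .build());
client.addTransportAddress(new InetSocketTransportAddress("elasticsearch-t1", 9300));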
Jörg
On 02.04.2013 02:30, ElasticRook wrote:
I've been having an issue where my transport client is able to communicate
fine with Elasticsearch on EC2; however, as soon as I rebuild the
Elasticsearch box and its IP changes, the TransportClient throws a
NoNodeAvailableException.
Well, I currently only have one node in the cluster, so I kept sniffing off. I
think the problem is more that the transport client keeps the old IP and does
not resolve to the new one when the IP changes on the node box. I had to
restart the service that uses the transport client for it to resolve the
hostname again and get the latest IP.
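As a rough sketch of a workaround (assuming InetSocketTransportAddress resolves
the hostname once, at construction time, and that removeTransportAddress is
available in this client version), the old address can be dropped and a freshly
constructed one added so the hostname is looked up again, instead of restarting
the whole service. The helper name and the kept reference are hypothetical:

// Hypothetical helper: 'currentAddress' is the InetSocketTransportAddress instance
// that was originally added to the client (kept around for exactly this purpose).
private InetSocketTransportAddress refreshTransportAddress(TransportClient client,
        InetSocketTransportAddress currentAddress, String host, int port) {
    client.removeTransportAddress(currentAddress);                  // drop the address holding the stale IP
    InetSocketTransportAddress fresh = new InetSocketTransportAddress(host, port); // resolves the hostname again
    client.addTransportAddress(fresh);
    return fresh;
}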
On Tuesday, April 2, 2013 1:01:48 AM UTC-7, Jörg Prante wrote:
What you observe here is due to your setting "client.transport.sniff" = false.
If you disable sniffing, the TransportClient will not try to look for other
nodes of the cluster.
Jörg
On 02.04.2013 02:30, ElasticRook wrote:
I've been having an issue where my transport client is able to communicate
fine with Elasticsearch on EC2; however, as soon as I rebuild the
Elasticsearch box and its IP changes, the TransportClient throws a
NoNodeAvailableException.
For failover, you need at least two data nodes that your client is connected
to. If you have only one node, a client connection may fail without being able
to recover.
Jörg
On 02.04.13 18:23, ElasticRook wrote:
Well, I currently only have one node in the cluster, so I kept sniffing off.
I think the problem is more that the transport client keeps the old IP and
does not resolve to the new one when the IP changes on the node box. I had to
restart the service that uses the transport client for it to resolve the
hostname again and get the latest IP.
On Tuesday, April 2, 2013 1:01:48 AM UTC-7, Jörg Prante wrote:
What you observe here is due to your setting "client.transport.sniff" = false.
If you disable sniffing, the TransportClient will not try to look for other
nodes of the cluster.
Jörg
On 02.04.2013 02:30, ElasticRook wrote:
> I've been having an issue where my transport client is able to communicate
> fine with Elasticsearch on EC2; however, as soon as I rebuild the
> Elasticsearch box and its IP changes, the TransportClient throws a
> NoNodeAvailableException.
Thanks for responding. Yes, I agree. I will be adding new nodes soon, and in
that case I will turn sniffing on. However, that will not solve my original
issue.
On Tuesday, April 2, 2013 9:43:02 AM UTC-7, Jörg Prante wrote:
For failover, you need at least two data nodes that your client is connected
to. If you have only one node, a client connection may fail without being able
to recover.
Jörg
On 02.04.13 18:23, ElasticRook wrote:
Well, I currently only have one node in the cluster, so I kept sniffing off.
I think the problem is more that the transport client keeps the old IP and
does not resolve to the new one when the IP changes on the node box. I had to
restart the service that uses the transport client for it to resolve the
hostname again and get the latest IP.
On Tuesday, April 2, 2013 1:01:48 AM UTC-7, Jörg Prante wrote:
What you observe here is due to your setting "client.transport.sniff" = false.
If you disable sniffing, the TransportClient will not try to look for other
nodes of the cluster.
Jörg
On 02.04.2013 02:30, ElasticRook wrote:
> I've been having an issue where my transport client is able to communicate
> fine with Elasticsearch on EC2; however, as soon as I rebuild the
> Elasticsearch box and its IP changes, the TransportClient throws a
> NoNodeAvailableException.
The TransportClient maintains a persistent connection to one or more nodes of
a cluster. But only by actively using such a connection can the system learn
whether the connection is still valid (and it times out or switches over to
another connection if it is no longer usable).
Note, a transport client does not store or receive the cluster state and has
no knowledge of the internal network state of the cluster it is currently
connected to. If you change the cluster network state, you can't expect a
transport client to be capable of tracking it. A TransportClient is not
directly attached to the cluster (for instance, it is invisible to other
transport clients).
If you want tighter client integration, you can use a node client, which is
aware of the current cluster network state. You might see error messages
appearing earlier in the log when connections become unusable, but you will
also notice client reconnects, I'm quite sure.
Jörg
On 02.04.13 19:12, ElasticRook wrote:
Thanks for responding. Yes, I agree. I will be adding new nodes soon, and in
that case I will turn sniffing on. However, that will not solve my original
issue.
I haven't had much of a chance to explore multi-node clusters in a heavy-usage
failover environment. I use static discovery and give each node and my
TransportClient the list of all of the IP addresses. So when one of the nodes
goes offline, the TransportClient still has other nodes it can use, and
failover seems to work just fine.
Can I (indeed, should I) create just one NodeClient that is shared by all
threads within an application? This is the current usage of the
TransportClient, and migrating my code would be easier if the same usage
pattern (one NodeClient shared by all threads) were acceptable.
Since I can't pass a list of addresses, can I pass in the list of two or more
node addresses on the Java command line via
-Des.discovery.zen.ping.unicast.hosts=$HOSTS (as I do when I start ES itself,
since I don't have a per-installation unique elasticsearch.yml file)?
And if the NodeClient is on the same localhost as the one ES server data node
(which is the case on my laptop and on several of our in-house QA systems),
does passing in the cluster name also make it look on the local host? (When I
create a TransportClient, I give it the one localhost address.)
Thanks for any insights and corrections you can give me!
Regards,
Brian
On Tuesday, April 2, 2013 4:08:05 PM UTC-4, Jörg Prante wrote:
The TransportClient maintains a persistent connection to one or more nodes of
a cluster. But only by actively using such a connection can the system learn
whether the connection is still valid (and it times out or switches over to
another connection if it is no longer usable).
Note, a transport client does not store or receive the cluster state and has
no knowledge of the internal network state of the cluster it is currently
connected to. If you change the cluster network state, you can't expect a
transport client to be capable of tracking it. A TransportClient is not
directly attached to the cluster (for instance, it is invisible to other
transport clients).
If you want tighter client integration, you can use a node client, which is
aware of the current cluster network state. You might see error messages
appearing earlier in the log when connections become unusable, but you will
also notice client reconnects, I'm quite sure.
Absolutely, a NodeClient is in fact based on an ordinary node behind the
scenes, with all the bells and whistles of discovery, configuration,
logging, settings ...
You can configure a NodeClient as you would a data node (see the "network.*"
settings), and because binding to all interfaces is the default, I think
localhost will also be used if you do not configure the NodeClient at all. At
least, the hostname's IP is used. There may in some cases be JVM-related
issues in hostname-to-IP resolution if /etc/hosts is messed up and both
localhost and the hostname point to 127.0.0.1, but this is not ES specific.
Thank you very much for the quick response. I've updated all of my servers
and command-line drivers to accept the Client interface instead of the
TransportClient object, and then in one place I can optionally create
either a TransportClient or a NodeClient and then pass its reference along
as a Client.
I noticed (via TRACE-level logging) that when I create a NodeClient but
don't configure it at all, it goes into zen multicast discovery on
localhost. That works fine.
Then I updated my Java code to configure the NodeClient as follows:
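A minimal sketch of such a configuration, assuming the 0.20.x Java API
(clusterName is an assumed variable; hostNamesToString is the helper described
just below):

Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", clusterName)                                      // assumed variable holding the cluster name
        .put("discovery.zen.ping.multicast.enabled", false)                    // static (unicast) discovery only
        .put("discovery.zen.ping.unicast.hosts", hostNamesToString(hostNames)) // comma-separated host list
        .build();
Node node = NodeBuilder.nodeBuilder()
        .settings(settings)
        .client(true)   // client-only node
        .data(false)    // holds no data
        .node();
Client client = node.client();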
This also works (where hostNamesToString converts an array of host names to a
comma-separated string). And I finally resurrected my 3-host cluster to verify
this. But I now have some additional questions:
With static discovery, if I just specify one of the host names of the
3-host cluster, will NodeClient discover the other two hosts? Or does it
work exactly like a data node in which it must know about all of the other
nodes when zen multicast is disabled?
Zen multicast is disabled for my data nodes due to recommendations on
this newsgroup as part of the strategy for avoiding split-brain situations.
But for this client-only non-data NodeClient, would you recommend leaving
zen multicast on and letting it dynamically find the hosts that are running
that cluster?
Thanks so much for your insights and recommendations!
Yes, it is enough for discovery to detect even a single member of the cluster
in order to join it. You do not need to specify all the cluster members. A few
members are better, to ensure discovery always succeeds.
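For example (host names hypothetical), the unicast list in your settings could
name just two of the three data nodes:

        .put("discovery.zen.ping.unicast.hosts", "es-data1:9300,es-data2:9300") // two of the three members is enough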
Zen discovery and split-brain situations are two different topics. In most
cases, if only nodes go down and up and the network is stable, Zen discovery
works fine. The assumption is that, per multicast request, all nodes can "see"
the event and react properly. But if network connections fail while nodes stay
up and continue to run, the cluster may enter a so-called byzantine situation.
A bad case is a 50%:50% split. In such a case, the nodes would have to agree
about how to continue without knowing what the other half of the nodes are
currently doing. To avoid this, the total cluster node count should be odd,
not even, and the setting discovery.zen.minimum_master_nodes should be set to
a high enough number, at least half the cluster node count plus one. By doing
that, if just a few nodes split off, they are told not to elect a new master,
and they can safely rejoin the cluster again later.
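As a concrete illustration (cluster size chosen for the example): with 5
nodes, a majority is 5/2 + 1 = 3, so each node's elasticsearch.yml would carry

        discovery.zen.minimum_master_nodes: 3

and a split-off group of 2 nodes could then never elect its own master.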
The message "no masterNode returned" is a trace-level message and is just
informational. The node client discovery did not find a master node among the
nodes that answered the multicast request (if there were any nodes at all), so
the next step is to elect a new master.