TransportClient retry logic/resiliency

I have added 3 TransportClient nodes while creating a client.

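import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Enable sniffing so the client discovers the other nodes of the cluster.
Settings settings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", clusterName)
    .put("client.transport.sniff", true)
    .build();
TransportClient client = new TransportClient(settings);
// Register the three known hosts.
client.addTransportAddresses(new InetSocketTransportAddress(esHost1, esPort1));
client.addTransportAddresses(new InetSocketTransportAddress(esHost2, esPort2));
client.addTransportAddresses(new InetSocketTransportAddress(esHost3, esPort3));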

esHost1 and esHost2 are down, but esHost3 is running. However, when I try
to connect, it gives a NoNodeAvailableException. Based on the round-robin
logic applied to each action, I was expecting the following:

  1. try to connect to esHost1
  2. NoNodeAvailableException after ping.timeout
  3. try to connect to esHost2
  4. NoNodeAvailableException after ping.timeout
  5. try to connect to esHost3, and connect successfully

So now I am beginning to think that the round robin applies to the
actions, but not to the case where there is a NoNodeAvailableException.
Is that correct?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1a5ef4f3-1046-4db6-9803-a308f51cf79b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

If you get a NoNodeAvailableException, none of the hosts are available.

If you have "sniff" on, TransportClient tries to connect to all discovered
nodes; this may take up to 10-15 seconds.

Steps 1-5 are performed in parallel, not sequentially, i.e. each added
transport address is connected (or not) immediately.

The correct method is to add the known host addresses with
addTransportAddresses() and afterwards check the connectedNodes() method.
If it returns an empty list, no nodes could be found.

Jörg
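
The check described above can be sketched as a small polling helper. This
is only an illustration under stated assumptions, not part of the
Elasticsearch API: the class ConnectedNodesCheck and the method
awaitConnectedNode are hypothetical names, and the node count is passed in
as a supplier so the sketch runs without a cluster. With a real client the
supplier would presumably be () -> client.connectedNodes().size().

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not an Elasticsearch class.
public final class ConnectedNodesCheck {
    private ConnectedNodesCheck() {}

    /**
     * Polls the supplied node count until it is positive or the timeout
     * elapses. Returns true if at least one node was reported in time.
     */
    public static boolean awaitConnectedNode(IntSupplier connectedNodeCount,
                                             long timeoutMillis,
                                             long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (connectedNodeCount.getAsInt() > 0) {
                return true;               // at least one node is connected
            }
            if (System.nanoTime() >= deadline) {
                return false;              // gave up: no node became available
            }
            try {
                Thread.sleep(pollMillis);  // wait before asking again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;              // treat interruption as "not connected"
            }
        }
    }
}
```

Calling such a helper before firing the first action gives the client time
to connect; if it returns false, the caller knows up front that no node
could be reached instead of finding out through a NoNodeAvailableException.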

I understand the sniffing and steps 1-5 being parallel.

Now, I am still trying to work out why I am getting the
NoNodeAvailableException in my use case. I know esHost1 and esHost2 are
down; only esHost3 is up and running. Considering what you said (steps 1-5
being parallel), in my case I am firing the persisting executions/actions
even before the TransportClient can sniff around and find that esHost1 and
esHost2 are down. So the TransportClient did not get the time to remove
those nodes from the list that was created by the initial
addTransportAddresses() calls. As a result, per the round-robin logic, the
first node, esHost1, was chosen to carry out the requested executions. As
this host was down, I got the NoNodeAvailableException. Is that correct?

If that is true, then isn't there any retry logic in the TransportClient
so that, if an execution has failed on one node, the same execution is
propagated to another node?

The TransportClient does not perform any retry/error logic on connected nodes.

When an action is executed, the TransportClient picks a connection from the
pool and executes the action once on this connection. When there is no
connection, the NoNodeAvailableException is thrown. When the action fails,
it is reported straight to the user. There would not be much sense in
retrying actions silently without user interaction; for example, if an
index operation fails, the user should be in charge of deciding what to do
next.

Jörg
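
Since the retry decision is left to the caller, one way to handle it is a
small wrapper in application code. The following is a minimal sketch under
stated assumptions, not something the TransportClient provides: CallerRetry
and execute are hypothetical names, and the caller supplies the predicate
that decides which failures are worth retrying (for example
e -> e instanceof NoNodeAvailableException, which is an unchecked
exception in the 1.x Java API).

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical application-side helper, not an Elasticsearch class.
public final class CallerRetry {
    private CallerRetry() {}

    /**
     * Runs the action up to maxAttempts times. A failure is retried only if
     * the caller-supplied predicate accepts it; otherwise, or once the
     * attempts are used up, the last exception is rethrown to the caller.
     */
    public static <T> T execute(Supplier<T> action,
                                Predicate<RuntimeException> retryable,
                                int maxAttempts,
                                long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();       // success: hand the result back
            } catch (RuntimeException e) {
                last = e;
                if (!retryable.test(e) || attempt == maxAttempts) {
                    break;                 // not retryable, or out of attempts
                }
                try {
                    Thread.sleep(pauseMillis);  // give the client time to reconnect
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;                 // stop retrying when interrupted
                }
            }
        }
        throw last;
    }
}
```

Keeping the predicate explicit means a failed index operation is never
repeated silently; only the failures the application has chosen to treat as
transient are tried again.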
