Option "client.transport.sniff" not working with v5.4.0 Transport Client

Hi,

We have just upgraded our Elasticsearch cluster from v5.3.2 to v5.4.0 but are now experiencing problems with our Java apps that connect to the cluster using the Transport Client. After some digging we found out that the problem seems to be related to the “client.transport.sniff” option, which we have set to true. If this option is set to false the problem disappears.

We also discovered that it is the version of the Transport Client library that matters here and not that of the cluster:

  • a v5.3.2 Transport Client can connect to both a v5.3.2 and a v5.4.0 cluster
  • a v5.4.0 Transport Client can connect to neither a v5.3.2 nor a v5.4.0 cluster

Before opening an issue, I first wanted to ask around if anyone else has been experiencing the same problem or if we have maybe just missed some deliberate changes to the Transport Client in v5.4.0 that require us to configure it differently.

Any help would be greatly appreciated.



Below you will find some information related to our setup and the described problem:
  • single-node cluster
  • hosted via Docker (v1.13.1)
  • on top of CentOS 6
  • container started using the following command:
    docker run -p 9200:9200 -p 9300:9300 docker.elastic.co/elasticsearch/elasticsearch:5.4.0 elasticsearch -E xpack.security.enabled=false -E network.publish_host=192.168.200.8

The code that we use to connect to the cluster from our Java apps:

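import java.net.InetAddress;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

// Sniffing enabled: with the v5.4.0 client this is the setting that triggers the failure
Settings settings = Settings.builder()
  .put("cluster.name", "docker-cluster")
  .put("client.transport.sniff", true)
  .build();

// Connect to the node's transport port (9300), not the HTTP port (9200)
TransportClient client = new PreBuiltTransportClient(settings)
  .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("192.168.200.8"), 9300));

// Search across all indices and print the total hit count
SearchResponse response = client.prepareSearch("_all").get();
System.out.println(response.getHits().getTotalHits());
client.close();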

This works perfectly with the v5.3.2 client but fails with the following exception if we use the v5.4.0 client:

Exception in thread "main" NoNodeAvailableException[None of the configured nodes are available: [{#transport#-1}{5V1ekMUAQfu_7mZBI-KlrQ}{192.168.200.8}{192.168.200.8:9300}]]
	at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:348)
	at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:246)
	at org.elasticsearch.client.transport.TransportProxyClient.execute(TransportProxyClient.java:59)
	at org.elasticsearch.client.transport.TransportClient.doExecute(TransportClient.java:366)
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:408)
	at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:80)
	at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:54)
	at org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:62)
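
With sniffing enabled, the client fetches the cluster state from the listed node and then connects to the addresses the nodes publish (network.publish_host in the command above). The v5.3.2 client working against the same cluster already suggests the network is fine, but a plain TCP check of the published transport address is a cheap way to rule that out. This is only a sketch using the JDK; TransportPortCheck and isReachable are made-up names, not part of the Elasticsearch client, and the default host and port are simply the ones from our setup:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TransportPortCheck {

    // Returns true if a TCP connection to host:port can be opened within timeoutMillis.
    // This only proves the port accepts connections, not that the transport handshake works.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the publish host and transport port from the setup described above
        String host = args.length > 0 ? args[0] : "192.168.200.8";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9300;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

If this reports the port as reachable and the client still throws NoNodeAvailableException, the problem is more likely in the client itself than in the Docker networking.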

Hi @munk77,

I have exactly the same problem.
My unit tests are failing because they can't connect to my local cluster (127.0.0.1:9300).
It started happening after I upgraded my ES and transport library to 5.4.0.
As you said, it works when I set sniff to false. :frowning:

I get the same exception you described, but another one is logged before it:

org.elasticsearch.transport.ReceiveTimeoutTransportException: [][127.0.0.1:9300][cluster:monitor/state] request_id [2] timed out after [5006ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:925) [elasticsearch-5.4.0.jar:5.4.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.4.0.jar:5.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

I don't know why, but when I run in debug mode and pause at a breakpoint for a moment, sniff=true works. :open_mouth:
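
If the failure really is timing-related, as the breakpoint behaviour hints, a stopgap could be to retry the first request after a short pause. I have not verified that retrying actually helps with this bug, so treat this as a sketch only. RetryWithDelay and retry are hypothetical names, not part of the Elasticsearch client, and the lambda in main is a stand-in for the real call (for example () -> client.prepareSearch("_all").get()):

```java
import java.util.concurrent.Callable;

public class RetryWithDelay {

    // Calls the action up to maxAttempts times, sleeping delayMillis after each failed
    // attempt except the last. Rethrows the last exception if every attempt fails.
    static <T> T retry(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                // NoNodeAvailableException is unchecked, so it would be caught here too
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in action that fails twice before succeeding
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("not ready yet");
            }
            return "ok";
        }, 5, 100);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Setting sniff to false is still the simpler workaround; a retry like this only makes sense if sniffing has to stay enabled.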

@danilo.akamine

There is another topic on this board that describes the same problem: "Elastic 5.4.0 TransportClient does not work"

It seems that the problem might already be fixed in the upcoming v5.4.1 release.

Thanks @munk77!
