It happens quite randomly and quite often. I restart the container where the application is running a few more times, and then it connects to the cluster over the transport client and indexes data. It can also connect a few times in a row and then stop connecting. By "connecting" I mean each time I start the container and the application builds the ES configuration during initialization.
I am using the following parameters for Settings:
"cluster.name",
"client.transport.sniff" (always false),
and "client.transport.nodes_sampler_interval".
I've sometimes seen the same error message as well, and in my case the cause was long GC pauses. The default connection timeout is 5 seconds, at least for the Java transport client. If a garbage collection run in ES takes longer than that, the client will fail with that error.
Check your ES logs for lines like this:
[2017-05-27T04:00:24,507][WARN ][o.e.m.j.JvmGcMonitorService] [7LnWwxW] [gc][66855] overhead, spent [8.4s] collecting in the last [9.2s]
If yes, you need to increase the heap size for ES.
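To make that log check less manual, here is a minimal sketch that scans log lines for GC overhead warnings and flags those at or above a threshold. The class and method names (GcPauseScanner, gcSeconds, longPauses) are made up for illustration. The regex only assumes the "spent [..] collecting in the last [..]" wording from the line above, with values in "s" or "ms". It treats the time spent collecting as a rough proxy for a pause, which is an approximation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Elasticsearch.
class GcPauseScanner {
    // Assumed format: "... overhead, spent [8.4s] collecting in the last [9.2s]"
    private static final Pattern OVERHEAD = Pattern.compile(
        "spent \\[([0-9.]+)(ms|s)\\] collecting in the last \\[([0-9.]+)(ms|s)\\]");

    /** Time spent collecting, in seconds, or -1 if the line is not a GC overhead line. */
    static double gcSeconds(String line) {
        Matcher m = OVERHEAD.matcher(line);
        if (!m.find()) {
            return -1;
        }
        double value = Double.parseDouble(m.group(1));
        return m.group(2).equals("ms") ? value / 1000.0 : value;
    }

    /** Returns the lines whose collection time is at least thresholdSeconds. */
    static List<String> longPauses(List<String> lines, double thresholdSeconds) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (gcSeconds(line) >= thresholdSeconds) {
                hits.add(line);
            }
        }
        return hits;
    }
}
```

Calling GcPauseScanner.longPauses(lines, 5.0) on the lines of the ES log would list the warnings that exceed the 5 second timeout mentioned above.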
I solved the problem. We were using a load balancer in front of the cluster, and I was passing the host name of this load balancer when building the Transport Client for the underlying cluster. The IPs behind this LB were constantly changing, so the connection sometimes worked and sometimes did not. We now initialize the Transport Client directly with the node IPs (not the LB) and it is stable. Snippet:
// clusterNodes holds the node IPs or host names directly, not the load balancer address
for (String clusterNode : clusterNodes) {
    client.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(clusterNode), 9300));
}
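For anyone who wants to check whether they are hitting the same thing, a rough way is to resolve the load balancer host name a few times and compare the results. This is only a sketch using the JDK's InetAddress.getAllByName. The class and method names (LbResolutionCheck, resolve, changed) are made up. Keep in mind that the JVM caches successful lookups for a while (controlled by the networkaddress.cache.ttl security property), so changes may only show up after that cache expires.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for diagnosing a changing load balancer address set.
class LbResolutionCheck {
    /** Resolves host to the sorted set of IP address strings it currently maps to. */
    static Set<String> resolve(String host) {
        try {
            Set<String> ips = new TreeSet<>();
            for (InetAddress address : InetAddress.getAllByName(host)) {
                ips.add(address.getHostAddress());
            }
            return ips;
        } catch (UnknownHostException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new UncheckedIOException(e);
        }
    }

    /** True if two resolutions of the same host returned different address sets. */
    static boolean changed(Set<String> before, Set<String> after) {
        return !before.equals(after);
    }
}
```

If changed(...) returns true between two calls to resolve(...) made some time apart, the addresses behind the host name are moving, and a client that resolved the name once at startup may be holding a stale address.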