Connection Exception: "An existing connection was forcibly closed by the remote host"

I'm using the RestHighLevelClient to create indices and mappings on the ES cluster.
The problem is that when I attempt to connect to the cluster after a period of inactivity, the first call throws the following:

"ElasticSearch initialisation exception:java.io.IOException: An existing connection was forcibly closed by the remote host"

On the next attempt, the operation succeeds.
Here's my client configuration:

@Configuration
public class ElasticSearchConfiguration {

    // The config and log fields are not shown in this post.
    private RestHighLevelClient restHighLevelClient;

    // Build the client once at startup and reuse it for every request.
    @PostConstruct
    public void init() {
        RestClientBuilder builder = RestClient
                .builder(new HttpHost(config.getHost(), config.getPort()))
                .setMaxRetryTimeoutMillis(60000);
        RestClientBuilder.RequestConfigCallback requestConfigCallback = requestConfigBuilder ->
                requestConfigBuilder.setConnectTimeout(5000).setSocketTimeout(60000);
        builder.setRequestConfigCallback(requestConfigCallback);
        restHighLevelClient = new RestHighLevelClient(builder);
    }

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return restHighLevelClient;
    }

    // Release the client's connections and threads on shutdown.
    @PreDestroy
    public void cleanup() {
        try {
            log.info("Closing the ES REST client");
            this.restHighLevelClient.close();
        } catch (IOException ioe) {
            log.error("Problem occurred when closing the ES REST client", ioe);
        }
    }
}

In Spring I then autowire the client into a @Component bean where it's reused for every request to the cluster. So I'm not creating a new instance of the client for each request.

How can I fix the connection Exception?

I think it's best to retry on this kind of exception, because it can happen for other reasons than a period of inactivity, and retrying is the right thing to do whenever it's encountered.

However, the fact that it occurs after every period of inactivity suggests that something on the network between your client and Elasticsearch is forcibly closing idle connections. Sometimes people configure their firewalls to do this. You could make this exception less common by fixing whatever is closing the connection, or by avoiding idle periods, for example by periodically sending a simple request such as GET /.
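A minimal sketch of the periodic-request idea, in plain Java so it runs without the Elasticsearch client on the classpath. The class name IdlePinger and the ping parameter are hypothetical: ping stands in for whatever sends GET / through your client, and the interval would need to be shorter than the idle timeout of whatever is closing the connections.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the Elasticsearch client.
final class IdlePinger implements AutoCloseable {

    // Single daemon thread so the pinger never keeps the JVM alive on its own.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "es-idle-pinger");
                t.setDaemon(true);
                return t;
            });

    // ping is an assumption: supply a Runnable that sends a cheap request such as GET /.
    IdlePinger(Runnable ping, long interval, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                ping.run();
            } catch (RuntimeException e) {
                // Swallow the failure; an uncaught exception would cancel all future runs.
            }
        }, interval, interval, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger pings = new AtomicInteger();
        // Stand-in ping that only counts calls; a real one would call the cluster.
        try (IdlePinger pinger = new IdlePinger(pings::incrementAndGet, 50, TimeUnit.MILLISECONDS)) {
            Thread.sleep(300);
        }
        System.out.println("pings sent: " + pings.get());
    }
}
```

Note that a single ping travels over one pooled connection, so other idle connections in the pool may still be closed. This makes the exception less frequent rather than impossible, which is why retrying is still worthwhile.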

Hi David, thanks for the help.
I'm checking whether the problem might be the firewall.
But I wonder, shouldn't the RestHighLevelClient itself be responsible for discarding expired or unusable connections in the pool, so that it always offers a valid connection?

You were right: the ES cluster is behind a load balancer, and I'm connecting to the load balancer.
The load balancer is configured with a 5-minute idle timeout.

Regarding your suggestion to do a retry, given I've configured the client with:

.setMaxRetryTimeoutMillis(60000)

I thought the client itself would do a retry. Am I correct?
Otherwise, rather than implementing a retry myself or sending a GET /, is it possible to fix this just by configuring the RestHighLevelClient? For example, by enabling keep-alive messages?

Looking at the code of org.elasticsearch.client.RestClient, it looks like a retry on failure is attempted only when the client is built pointing to multiple ES nodes. Correct me if I'm wrong.

But that's not my case, because I only specify the load balancer address. I'm tempted to 'cheat' by building the RestClient with the same load balancer IP/port passed twice, so as to force the retry.

It looks like we deduplicate nodes by address, so I think this won't help.

I think we try nodes one by one, but don't retry on a node that has seen a failure unless it's the last resort. However, I might be misunderstanding, so I'm going to ask a colleague.

Retrying this yourself seems to be the thing to do.
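A rough sketch of what such a retry could look like, in plain Java so it runs on its own. The names RetryOnIo, IoCall and callWithRetry are hypothetical and not part of the client; the demo in main simulates a first call that fails on a stale connection and a second call that succeeds.

```java
import java.io.IOException;

// Hypothetical helper, not part of the Elasticsearch client.
final class RetryOnIo {

    /** A call that may fail with an IOException, e.g. a request sent over a stale connection. */
    @FunctionalInterface
    interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Runs the call and retries it when it throws an IOException,
     * giving up and rethrowing the last exception after maxAttempts tries.
     */
    static <T> T callWithRetry(IoCall<T> call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e; // remember the failure and try again if attempts remain
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Simulated request: the first attempt fails, the second succeeds.
        String result = callWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new IOException("An existing connection was forcibly closed by the remote host");
            }
            return "acknowledged";
        }, 2);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the application, the lambda would wrap the actual client call, such as the index creation. A blind retry like this assumes the failed request never reached the cluster, so for non-idempotent operations it may be worth checking the current state before trying again.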
