Hi all,
I'm using Amazon Elasticsearch with RestHighLevelClient in our team's application and have noticed occasional small dips in the success rate of the canary traffic that calls our API. The dips are caused by the error below.
Caused by: java.net.ConnectException: Timeout connecting to [search-mydomain-htb7yrdp7xvt2k3apw5phcz7ku.us-east-1.es.amazonaws.com/52.72.66.108:443]
    at org.apache.http.nio.pool.RouteSpecificPool.timeout(RouteSpecificPool.java:169) ~[Apache-HttpComponents-HttpCore-4.4.x.jar:?]
    at org.apache.http.nio.pool.AbstractNIOConnPool.requestTimeout(AbstractNIOConnPool.java:628) ~[Apache-HttpComponents-HttpCore-4.4.x.jar:?]
    at org.apache.http.nio.pool.AbstractNIOConnPool$InternalSessionRequestCallback.timeout(AbstractNIOConnPool.java:894) ~[Apache-HttpComponents-HttpCore-4.4.x.jar:?]
    at org.apache.http.impl.nio.reactor.SessionRequestImpl.timeout(SessionRequestImpl.java:184) ~[Apache-HttpComponents-HttpCore-4.4.x.jar:?]
    at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processTimeouts(DefaultConnectingIOReactor.java:214) ~[Apache-HttpComponents-HttpCore-4.4.x.jar:?]
    at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:158) ~[Apache-HttpComponents-HttpCore-4.4.x.jar:?]
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351) ~[Apache-HttpComponents-HttpCore-4.4.x.jar:?]
    at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:194) ~[httpasyncclient-4.1.x.jar:?]
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[httpasyncclient-4.1.x.jar:?]
    ... 1 more
This is my configuration for the RestHighLevelClient.
@Provides
@Singleton
public RestHighLevelClient providesHttpClient(@Named("CurrentRegion") String currentRegion,
                                              @Named("DomainEndpoint") String endpoint) {
    AWSCredentialsProvider awsCredentialsProvider = new DefaultAWSCredentialsProviderChain();

    // SigV4 signer scoped to the domain's region and service name
    AWS4Signer awsSigV4Signer = new AWS4Signer();
    awsSigV4Signer.setRegionName(currentRegion);
    awsSigV4Signer.setServiceName(AES_SERVICE_NAME);

    // Interceptor that signs every outgoing request
    HttpRequestInterceptor requestInterceptor = new AWSRequestSigningApacheInterceptor(AES_SERVICE_NAME,
            awsSigV4Signer, awsCredentialsProvider);

    // The interceptor is registered on the HTTP client builder, not on the request config
    return new RestHighLevelClient(RestClient.builder(HttpHost.create(endpoint))
            .setHttpClientConfigCallback(hacb -> hacb.addInterceptorLast(requestInterceptor)));
}
I have tried setting the RequestConfigCallback below, but the problem persisted.
return new RestHighLevelClient(RestClient.builder(HttpHost.create(endpoint))
        .setRequestConfigCallback(rcb -> rcb.setSocketTimeout(60000).setConnectTimeout(30000).setConnectionRequestTimeout(0))
        .setHttpClientConfigCallback(hacb -> hacb.addInterceptorLast(requestInterceptor)));
I have also tried adding a retry strategy, and that seemed to get rid of the issue, but with the trade-off that the RestHighLevelClient is created and closed per call instead of being a singleton. I believe this is not recommended per the documentation.
Regardless, I still do not understand the root cause of the ConnectionTimeout error, and I have only seen it with our canary traffic, which runs once per minute.
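For reference, a retry does not have to recreate the client on every call. It can wrap each call made on the singleton instead. The following is only a minimal sketch: `ConnectRetry`, `call`, `maxAttempts`, and `backoffMillis` are hypothetical names and not part of the Elasticsearch client. It assumes the ConnectException may arrive wrapped in another exception, as the "Caused by" line in the trace suggests. It only works around the symptom and does not explain the root cause.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of any library): retries a call on an
// existing client when the failure is caused by a ConnectException.
final class ConnectRetry {
    private ConnectRetry() {}

    // Runs the action up to maxAttempts times. Only failures whose cause
    // chain contains a ConnectException are retried; anything else is
    // rethrown immediately.
    static <T> T call(Callable<T> action, int maxAttempts, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts || !isConnectFailure(e)) {
                    throw e;
                }
                // Linear backoff: wait longer after each failed attempt.
                Thread.sleep(backoffMillis * attempt);
            }
        }
    }

    // Walks the cause chain looking for a java.net.ConnectException.
    static boolean isConnectFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }
}
```

With the injected singleton, a call could then look like `ConnectRetry.call(() -> client.search(request, RequestOptions.DEFAULT), 3, 200)`, where `client` and `request` stand for the existing RestHighLevelClient and a search request. The attempt count and backoff are placeholder values.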
Thank you in advance!