We have an Elasticsearch 8.3 cluster with 3 master nodes, and a search API that uses the Elasticsearch Java client to query data from Elasticsearch through the REST client.
The search API's Java client is set up to send query requests to all three master nodes, as shown below:
RestClient restClient = RestClient.builder(
        new HttpHost("localhost", 9200, "http"),
        new HttpHost("localhost", 9201, "http"),
        new HttpHost("localhost", 9202, "http")).build();
Problem statement:
When querying data from the Elasticsearch cluster, the search API's Java client sends requests to only one Elasticsearch node instead of all three. That node is the first one in the list above ("localhost", 9200, "http"); only once it goes down does the client send query requests to the second node.
Yes, we expect queries to be spread across all nodes so that the load is distributed and bulk queries from the Java client are served better.
Thank you again for your response; we are checking the coordinating node part.
We also want to ask about the many heavy query requests coming from the search API, e.g. 100k a day. At present all 100k queries go to one node only, i.e. the first node in the list.
This may put heavy request and network load on that server, since it handles all 100k searches from the search API.
RestClient sends requests to all nodes in succession, and there are extensive tests for this.
A common cause for the problem you describe is to have a new RestClient created for each request. Every new client starts with the first node, causing the issue. It also slows down the application a lot since RestClient is a heavyweight object.
Can you check this? The RestClient should be a singleton object that is reused for all requests, and not created anew every time.
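To illustrate why a client created per request keeps hitting the first node, here is a minimal model using only the JDK. `RotatingClient` and `RotationDemo` are hypothetical names, and the rotation logic is a simplification for illustration, not the actual RestClient implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a client that rotates across hosts; not the real RestClient.
final class RotatingClient {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger(0);

    RotatingClient(List<String> hosts) {
        this.hosts = List.copyOf(hosts);
    }

    // Returns the host that would serve this request, then advances the cursor.
    String send() {
        int i = Math.floorMod(next.getAndIncrement(), hosts.size());
        return hosts.get(i);
    }
}

final class RotationDemo {
    static final List<String> HOSTS =
            List.of("localhost:9200", "localhost:9201", "localhost:9202");

    // Anti-pattern: a fresh client per request always starts at the first host.
    static List<String> newClientPerRequest(int requests) {
        List<String> hit = new ArrayList<>();
        for (int r = 0; r < requests; r++) {
            hit.add(new RotatingClient(HOSTS).send());
        }
        return hit;
    }

    // One shared client keeps its cursor between requests, so hosts rotate.
    static List<String> sharedClient(int requests) {
        RotatingClient client = new RotatingClient(HOSTS);
        List<String> hit = new ArrayList<>();
        for (int r = 0; r < requests; r++) {
            hit.add(client.send());
        }
        return hit;
    }

    public static void main(String[] args) {
        System.out.println("new client per request: " + newClientPerRequest(3));
        System.out.println("shared client:          " + sharedClient(3));
    }
}
```

In this model the per-request variant reports the first host every time, while the shared variant walks through all three hosts.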
Thank you for your response. We have tried the singleton configuration in our Spring Boot application, but we are still facing the issue with connection distribution across the ES hosts.
Please check the code below. This is how we create the singleton RestClient object; the same object is injected into all the handlers, and we use it to create the transport layer and the ElasticsearchClient that runs the search templates on the Elasticsearch side.
Now the problem is closing the connection.
Problem #1 -
If we close the transport layer and shut down the ElasticsearchClient, the next request fails with a null pointer exception, because we have closed two of the layers, i.e. the transport and the ElasticsearchClient.
Problem #2 -
If we don't close the connection, we run into a memory issue on the client (requestor) server side.
Configuration of elastic client as below
public class ElasticConfig {
    @Bean
    public RestClient getESRestClientBuilder() {
        String eshostnames = env.getProperty("eshostnames");
        Integer port = Integer.parseInt(env.getProperty("port"));
        String instance_id = env.getProperty("instance_id");
        String keystore = env.getProperty("keystore");
        String key_pass = env.getProperty("keypass");
        .
        .
        .
        .
        .
        return restClientBuilder.build();
    }
}
Handler code as below
transport = new RestClientTransport(restClient,
        new JacksonJsonpMapper().withAttribute(JsonpMapperFeatures.SERIALIZE_TYPED_KEYS, false));
elasticsearchclient = new ElasticsearchClient(transport);
response = elasticsearchclient.searchTemplate(searchTemplateRequest, Object.class);
if (transport != null) {
    transport.close();
}
if (elasticsearchclient != null) {
    elasticsearchclient.shutdown();
}
The transport is just a thin layer on top of RestClient: if you close it, it will also close the RestClient, which explains the null pointer exception.
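The ownership described here can be modelled with plain JDK classes. `LowLevelClient`, `ThinTransport` and `LifecycleDemo` are hypothetical stand-ins, not the real client classes, and the model raises `IllegalStateException` where the real application reported a null pointer exception; it is only a sketch of why closing the wrapper per request breaks the next request.

```java
// Hypothetical stand-in for the shared low-level client.
final class LowLevelClient implements AutoCloseable {
    private boolean closed = false;

    String perform(String request) {
        if (closed) {
            throw new IllegalStateException("client already closed");
        }
        return "ok: " + request;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical thin wrapper: it owns nothing itself and delegates everything.
final class ThinTransport implements AutoCloseable {
    private final LowLevelClient client;

    ThinTransport(LowLevelClient client) {
        this.client = client;
    }

    String perform(String request) {
        return client.perform(request);
    }

    // Closing the wrapper closes the shared client underneath it.
    @Override
    public void close() {
        client.close();
    }
}

final class LifecycleDemo {
    // Anti-pattern: closes the transport (and so the shared client) after each request.
    static String handleAndClose(LowLevelClient shared, String request) {
        try (ThinTransport transport = new ThinTransport(shared)) {
            return transport.perform(request);
        }
    }

    // Reuses a long-lived transport and leaves closing to application shutdown.
    static String handle(ThinTransport transport, String request) {
        return transport.perform(request);
    }

    public static void main(String[] args) {
        ThinTransport longLived = new ThinTransport(new LowLevelClient());
        System.out.println(handle(longLived, "first"));
        System.out.println(handle(longLived, "second"));
        longLived.close(); // once, when the application stops
    }
}
```

Under these assumptions, the transport and client are created once, shared by all handlers, and closed a single time when the application shuts down.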
Regarding the memory issue (#2), can you make sure that adding @Bean on getESRestClientBuilder() is enough to make it a singleton? For example by adding a log statement in that method to trace how often it's called.
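A rough way to trace this is a counter plus a log line inside the factory method. The sketch below uses only the JDK; `ClientFactory` and `ClientHolder` are hypothetical names, and `new Object()` is a placeholder for the real client. Spring beans are singleton-scoped by default, so the counter simply verifies that assumption at runtime.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ClientFactory {
    static final AtomicInteger BUILD_COUNT = new AtomicInteger();

    // Stand-in for the @Bean method: counts and logs every invocation.
    static Object buildClient() {
        int n = BUILD_COUNT.incrementAndGet();
        System.out.println("buildClient() call #" + n);
        return new Object(); // placeholder for the real client
    }
}

final class ClientHolder {
    // Initialization-on-demand holder: the JVM runs this initializer at most once.
    private static final class Lazy {
        static final Object CLIENT = ClientFactory.buildClient();
    }

    static Object get() {
        return Lazy.CLIENT;
    }
}

final class SingletonCheckDemo {
    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            ClientHolder.get();
        }
        System.out.println("clients built: " + ClientFactory.BUILD_COUNT.get());
    }
}
```

If the log line appears more than once in the real application, something other than the shared bean is building clients.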
I will make sure the class is a singleton object. Could you advise in which phase, and where, the RestClient should close the connection?
We implemented this earlier; when we do not close the connection, we get the memory issue.
We have the following memory configuration on the web servers:
Total memory - 16 GB
We also want to understand the relationship between RestClient and HTTP async: how will a singleton handle a large number of static requests, given that we expect approximately 1 million hits per day?
Please elaborate on "memory issue". Have you inspected the JVM heap to understand where the issue comes from?
Regarding async, RestClient accepts both blocking and async requests. Actually, under the hood everything is async, and a blocking request just blocks the calling thread until the internal async request is finished.
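The "blocking on top of async" idea can be sketched with `CompletableFuture` from the JDK. `AsyncBackedClient` is a hypothetical class, and the thread pool and canned response are assumptions made for illustration; this is not how RestClient is implemented internally.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical client where the async path is the only real implementation.
final class AsyncBackedClient implements AutoCloseable {
    // Small shared pool standing in for the client's internal I/O threads.
    private final ExecutorService io = Executors.newFixedThreadPool(2);

    // Async variant: returns immediately with a future.
    CompletableFuture<String> searchAsync(String query) {
        return CompletableFuture.supplyAsync(() -> "result for " + query, io);
    }

    // Blocking variant: starts the same async request, then waits for it.
    String search(String query) {
        return searchAsync(query).join();
    }

    @Override
    public void close() {
        io.shutdown();
    }
}

final class AsyncDemo {
    public static void main(String[] args) {
        try (AsyncBackedClient client = new AsyncBackedClient()) {
            System.out.println(client.search("blocking"));
            client.searchAsync("async").thenAccept(System.out::println).join();
        }
    }
}
```

Because many callers share the same small pool, one long-lived instance can serve concurrent requests from many threads, which is the reason a singleton is the intended usage.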
Sorry for the late response. After configuring the singleton correctly, we are no longer seeing the memory issue the way we did earlier. We are still monitoring and running some tests.
We will come back if we identify any further issue.