REST client initialization

Assume I have a cluster containing master-eligible nodes, data nodes and coordinating nodes. When I initialize a REST client, which of the above node types should I include? All of the nodes, only the data nodes, only the coordinating nodes, or some combination thereof?

For example:
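    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;

    // Client that knows only about the master-eligible nodes
    RestClient restClient = RestClient.builder(
            new HttpHost("master-eligible-node-1", 9200, "http"),
            new HttpHost("master-eligible-node-2", 9200, "http"),
            new HttpHost("master-eligible-node-3", 9200, "http")).build();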

Or,
    RestClient restClient = RestClient.builder(
            new HttpHost("master-eligible-node-1", 9200, "http"),
            new HttpHost("master-eligible-node-2", 9200, "http"),
            new HttpHost("master-eligible-node-3", 9200, "http"),
            new HttpHost("data-node-1", 9200, "http"),
            new HttpHost("data-node-2", 9200, "http"),
            new HttpHost("data-node-n", 9200, "http"),
            new HttpHost("coord-node-1", 9200, "http"),
            new HttpHost("coord-node-n", 9200, "http")).build();

As a second part, is it possible to initialize the REST client with only coordinating nodes, or with only data nodes? I was thinking I might want to do that in order to isolate query operations from indexing operations.

Thanks!

I would leave out only the dedicated master nodes. It is better not to load them with requests, since their job is to be available to become the master when needed (and one of them will be the elected master).

You could send searches only to the coordinating nodes. They would then proxy the requests and also handle the reduce phase of the search, but every request would still have to be forwarded to the data nodes where the data sits. This may be beneficial, but I would not start with it; I would first test the benefits of such a setup.

We have been working on a new feature (coming with 6.4) that will allow custom node selectors. With it, it will be possible to have, for instance, an allocation-aware selector that tries to send requests to the closest nodes. I am not sure about sending indexing to data nodes and searches to coordinating nodes only; I would test it before assuming it's beneficial. Without the node selector improvement, though, the only way to do that is to have two separate client instances that point to two different sets of nodes.
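To make the idea of a role-based selector more concrete, here is a minimal sketch in plain Java. It is only a model of the concept and does not use the REST client's own API: the class NodeRouting, its nested Node and Role types, and the methods skipDedicatedMasters and nextHost are all hypothetical names introduced for illustration. It assumes that a dedicated master is a node whose only role is master, and that a coordinating-only node has no roles at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of role-based node selection; not part of the Elasticsearch client.
final class NodeRouting {

    enum Role { MASTER, DATA, INGEST }

    // Hypothetical description of one cluster node: a host name plus its roles.
    static final class Node {
        final String host;
        final Set<Role> roles;

        Node(String host, Set<Role> roles) {
            this.host = host;
            this.roles = roles;
        }

        // Assumption: "dedicated master" means master-eligible with no other role.
        boolean isDedicatedMaster() {
            return roles.contains(Role.MASTER) && roles.size() == 1;
        }
    }

    private final List<Node> candidates;
    private final AtomicInteger counter = new AtomicInteger();

    NodeRouting(List<Node> allNodes) {
        this.candidates = skipDedicatedMasters(allNodes);
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no eligible nodes left after filtering");
        }
    }

    // Keep every node except the dedicated masters, preserving the input order.
    static List<Node> skipDedicatedMasters(List<Node> allNodes) {
        List<Node> kept = new ArrayList<>();
        for (Node node : allNodes) {
            if (!node.isDedicatedMaster()) {
                kept.add(node);
            }
        }
        return kept;
    }

    // Round-robin over the remaining nodes.
    String nextHost() {
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return candidates.get(index).host;
    }

    public static void main(String[] args) {
        List<Node> cluster = Arrays.asList(
                new Node("master-eligible-node-1", EnumSet.of(Role.MASTER)),
                new Node("data-node-1", EnumSet.of(Role.DATA)),
                new Node("coord-node-1", EnumSet.noneOf(Role.class)));

        NodeRouting routing = new NodeRouting(cluster);
        for (int i = 0; i < 3; i++) {
            System.out.println(routing.nextHost());
        }
    }
}
```

With the three example nodes, the dedicated master is never picked and requests alternate between data-node-1 and coord-node-1. A selector that sends searches only to coordinating nodes would follow the same pattern with a different filter, and whether that split helps is still something to measure rather than assume.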

Cheers
Luca

Hi Luca, thank you for the insight.

This is rather fundamental: everyone starts by creating a REST client, so I would like to understand your recommendation fully. I understand not including the master nodes in the REST client initialization. It sounds like there is no significant advantage to having separate REST clients for query operations and indexing operations. If that is the case, I could use a single REST client. Should that single REST client be initialized only with data nodes? If I do that, is it still useful to create coordinating nodes at all?

Thanks again.

I think the benefit of having coordinating-only nodes needs to be measured in practice, the same as separating indexing and searching. It could be an optimization to make later if you run into problems. If your client goes directly to the data nodes, the coordinating-only nodes will certainly sit idle all the time, assuming no other clients or requests go through them.
