We currently run a 3-node Elasticsearch 6.2 cluster in GCP with no dedicated master nodes. All three nodes are identical in configuration and settings (4 vCPUs, 15 GB RAM, with 7 GB set as Xmx for ES). We use org.elasticsearch.client.RestClient to access the cluster. This setup has been running fine for about 2 months.

This morning we experienced issues on our app server, and the logs showed that all operations on ES (both index and search) were slow. The application that mainly creates data in the cluster connects to one node (say N1), and the application that mainly searches connects to another node (say N2); we run several instances of the search application on a cluster of 2-16 nodes. Thread dumps showed many threads waiting to get a connection to ES:
This made me suspect that a single node was being overburdened by all the incoming connection requests, so I changed some of the search nodes to connect to the third node (say N3) instead of N2. After this change the situation improved and accessing data from ES became fast again. I am not sure whether this was the only reason, or whether the load on our application server had already dropped drastically by the time I figured this out and made the change. Still, I think the change made a real difference, which is why I think setting up a load balancer (LB) to distribute the load would be better.
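To make the idea concrete, here is a minimal sketch of the kind of distribution I have in mind: each request goes to the next node in turn instead of always hitting N2. The class name RoundRobinNodes and the host names n1, n2, n3 are made up for illustration and are not part of our real setup or of the ES client. As far as I understand, RestClient.builder also accepts several HttpHost entries and rotates requests among them in a similar way, but I have not verified how that behaves under our load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out node addresses in round-robin order. */
public class RoundRobinNodes {
    private final List<String> nodes;
    // Shared counter so that concurrent callers still rotate through the nodes.
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobinNodes(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        // Defensive copy so later changes to the caller's list have no effect.
        this.nodes = new ArrayList<>(nodes);
    }

    /** Returns the next node address, wrapping around at the end of the list. */
    public String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        // Assumed addresses standing in for N1, N2 and N3.
        RoundRobinNodes cluster =
                new RoundRobinNodes(Arrays.asList("n1:9200", "n2:9200", "n3:9200"));
        for (int i = 0; i < 6; i++) {
            System.out.println("request " + i + " goes to " + cluster.pick());
        }
    }
}
```

With three nodes, six consecutive requests would be spread evenly, two per node. A real LB or client would also need health checks and failover, which this sketch leaves out.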
I have read several posts in this forum and on SO about whether a load balancer is necessary in front of an ES cluster, and the answers differ from post to post:
a) Some say an LB is not necessary
b) Some recommend setting up an LB that includes only master nodes
c) Some recommend setting up two LBs: one for writing/indexing documents in ES that includes only data nodes, and another for queries that includes only client nodes
What is the recommended way of setting up an LB?