We have an ES cluster with 10 data nodes and 5 additional master nodes. We are exploring the idea of adding co-ordinating nodes to the cluster to help with query performance, especially for aggregation queries.
The data stored in the ES cluster is mostly log data.
I have read a lot of documentation about co-ordinating nodes, but I am still unclear on the following points:
- Does adding co-ordinating nodes help in a cluster of our size?
- Currently, all query and indexing traffic goes to the data nodes, which sit behind an nginx load balancer. The plan is to add the co-ordinating nodes to the nginx LB as well. Would I need to remove the data nodes from the LB so that indexing and query traffic goes only through the co-ordinating nodes? Or can I keep both the data and co-ordinating nodes in the LB? The reason for keeping both is that if the co-ordinating nodes fail, the data nodes can still handle the indexing and query traffic, and we might avoid application downtime.
- I assume we would need more than one co-ordinating node. What is the ideal number for a cluster of this size? (The co-ordinating nodes would be configured similarly to the data nodes in terms of compute, memory, and network.)
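To make the load balancer question concrete, the fallback arrangement could be sketched in nginx roughly as follows. This is only an illustration: the upstream name, hostnames, and ports are placeholders, not the actual configuration. It relies on the `backup` parameter of the nginx `server` directive, which keeps a server out of rotation until the primary servers are unavailable.

```nginx
# Sketch only: upstream name, hostnames, and ports are placeholders.
upstream es_cluster {
    # Co-ordinating nodes take all traffic while at least one is reachable.
    server coord-1.example.internal:9200;
    server coord-2.example.internal:9200;

    # Data nodes are marked "backup": nginx sends them requests only
    # when the primary (co-ordinating) servers are unavailable.
    server data-1.example.internal:9200 backup;
    server data-2.example.internal:9200 backup;
    # ...remaining data nodes would follow the same pattern.
}

server {
    listen 9200;

    location / {
        proxy_pass http://es_cluster;
    }
}
```

Dropping the `backup` flag would correspond to the other option, where the data and co-ordinating nodes share the traffic at all times.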
Any guidance around this is highly appreciated.