We run a small Elasticsearch cluster of 3-5 nodes (autoscaled). We recently spun up a new 2.3.3 cluster that we will switch to once we have indexed our current data set into it. I am taking this opportunity to evaluate our configuration and make sure it makes sense for us. On the topic of a data-less node used for routing: are there performance gains to be had with such a small cluster, or are the gains only realized with a larger cluster? At what point does a node used for routing make more sense than an HTTP proxy?
A data-less node is an extra hop and adds to overall latency. It sounds like you want to direct all client traffic to that node. Be careful to avoid double resource consumption. If you run complex, resource-heavy queries such as aggregations while indexing documents in large bulk requests at the same time, you can move some of the index and query load onto separate nodes, but then you should add more than one data-less node, because otherwise that single node will soon become a bottleneck. If you have only small documents and only a light query load from simple queries, there is no need for extra data-less nodes.
I'm not sure what you mean by routing. Is your application designed for document routing or shard routing? In either case a data-less node offers no advantage; it only adds network latency.
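To make "document routing" concrete, here is a minimal Python sketch of the idea as I understand it from the Elasticsearch documentation: the target shard is derived from a hash of the routing value (the document ID by default) modulo the number of primary shards. The function name `pick_shard` and the sample IDs are hypothetical, and `zlib.crc32` is only a stand-in for the hash Elasticsearch uses internally, so the shard numbers printed here will not match a real cluster.

```python
import zlib


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value to a shard number in [0, number_of_primary_shards)."""
    # crc32 is a stand-in hash for illustration only; Elasticsearch applies
    # its own hash function to the routing value.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards


if __name__ == "__main__":
    # Hypothetical document IDs on an index with 5 primary shards.
    for doc_id in ("user-1", "user-2", "user-3"):
        print(doc_id, "->", "shard", pick_shard(doc_id, 5))
```

Every node in the cluster can do this calculation from the cluster state, which is why any node, data-holding or not, can forward a request to the right shard.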
An HTTP proxy is a different story: it is useful for clusters in private subnets or for authentication.
Thanks, I'm leaning towards not needing one of these data-less nodes. By routing, I meant that I've read that the data-less node (or client node) is "cluster aware", so it knows on which shards and nodes the requested data resides. I'm pretty new to the inner workings of Elasticsearch, so I may be way off base (I had a cluster dropped in my lap after one of our lead developers, who used to manage it, left). Like you said, I saw this as an extra "hop" and unnecessary, as at least 2 out of the 3 nodes will have the shard with the data being requested, so I don't imagine such a small cluster benefiting from a data-less node.
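As a rough check on the "2 out of 3 nodes" reasoning, here is a small Python sketch of the arithmetic. It assumes one replica per shard (the replica count is an assumption, not something stated about this cluster) and relies on the fact that Elasticsearch does not place two copies of the same shard on one node. The function name `local_copy_fraction` is made up for this illustration.

```python
from fractions import Fraction


def local_copy_fraction(nodes: int, replicas: int) -> Fraction:
    """Fraction of data nodes holding a copy (primary or replica) of one shard."""
    # A shard has one primary plus `replicas` copies, each on a different node,
    # so the number of nodes holding a copy can never exceed the node count.
    copies = min(1 + replicas, nodes)
    return Fraction(copies, nodes)


if __name__ == "__main__":
    # The autoscaling range from the question, assuming 1 replica per shard.
    for node_count in (3, 4, 5):
        share = local_copy_fraction(node_count, replicas=1)
        print(f"{node_count} nodes: {share} of nodes hold a copy of a given shard")
```

At 3 nodes most requests for a single shard land on a node that already holds a copy; as the cluster scales out that share drops, but the receiving data node still forwards the request itself, so a dedicated data-less node does not remove a hop.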