If you only have a few (e.g. 5) total nodes each doing master+data+http, is it really helpful to have a separate non-data client node, or does this make sense only for larger numbers?
Personally, I don't think client nodes are worth it when you have small clusters. At small cluster sizes, it tends to be better to just add another data node with that available machine, since it gives you more horsepower for your cluster.
The main advantages of client nodes are:
- Slightly reduced memory pressure on data nodes, since the client offloads search / agg reductions
- "Smart Routing": since client nodes know where all the data lives, they can avoid an extra hop
- Architecturally, it can be useful to use Client nodes as your access point to the cluster, so your app doesn't need to know the details
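To make the first bullet concrete, here is a toy sketch of the kind of reduction a coordinating node does: each shard returns its own top hits, and the coordinator merges them into one global top-N. The function name, the `(score, doc_id)` tuples, and the sample data are all made up for illustration; this is not how Elasticsearch implements it internally, just the shape of the work that a client node takes off the data nodes.

```python
import heapq

def reduce_top_hits(shard_results, size):
    """Merge per-shard lists of (score, doc_id) into a global top `size`.

    Hypothetical helper: models only the merge step, not real ES internals.
    """
    # Flatten every shard's hits into a single stream.
    all_hits = (hit for hits in shard_results for hit in hits)
    # Keep the `size` highest-scoring hits, sorted best first.
    return heapq.nlargest(size, all_hits, key=lambda hit: hit[0])

# Made-up per-shard results for three shards.
shard_results = [
    [(0.9, "a1"), (0.4, "a2")],
    [(0.8, "b1"), (0.7, "b2")],
    [(0.5, "c1")],
]
top = reduce_top_hits(shard_results, size=3)
```

The memory held by this merge (and by aggregation reductions, which are usually heavier) lives on whichever node coordinates the request, which is why moving it to a non-data node relieves the data nodes a little.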
What's the typical ratio of the different node types (e.g. 1 client node for every 3 data nodes?)
Really hard to say. I've seen everything from 3 client : 10 data and 5:50 up to 50:60. The clusters running high ratios (e.g. 50:60) typically belong to Java applications that embed the NodeClient directly. Most RESTful clusters are lower.
What's the minimum # of each node type that you want, for high availability?
For dedicated masters, you need at least 3. Any fewer is a waste, because you can't lose a single one and still have a majority. Adding more just increases the availability of your master pool. E.g. 5 dedicated masters means you can lose two master-eligible nodes and still keep your cluster up. They aren't really needed from a performance perspective, since only one of those nodes will be master at any particular time.
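The arithmetic behind those numbers is a simple majority rule. A minimal sketch, assuming the cluster needs a strict majority of master-eligible nodes to stay up (the function names are just for illustration):

```python
def quorum(master_eligible):
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1

def tolerable_failures(master_eligible):
    """How many master-eligible nodes can be lost while keeping a majority."""
    return master_eligible - quorum(master_eligible)

for n in (1, 2, 3, 5):
    print(n, "masters -> quorum", quorum(n), "can lose", tolerable_failures(n))
```

With 1 or 2 masters you can't lose any, with 3 you can lose one, and with 5 you can lose two, which is why 3 is the practical minimum and an even count buys nothing over the odd number below it.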
For data nodes, it's impossible to say; it depends entirely on your search, indexing, and storage requirements.
If I'm getting EsRejectedExecutionException[rejected execution (queue capacity 250) then how can I tell if splitting out data:false nodes separately would be helpful, or if I just need more capacity overall?
So what this means is that you are bottlenecking on that queue. If those are indexing-related exceptions, it means you are trying to index faster than your cluster can tolerate at the moment, and the queue is filling up. Queues are basically backpressure telling your app to slow down. If it is indexing-related, you might get some tips from this article: "Performance Considerations for Indexing in Elasticsearch"
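On the application side, "slow down" usually means retrying rejected requests with a growing delay instead of hammering the queue. Here is a rough sketch of that idea. `RejectedError`, `send_with_backoff`, and the `send` callable are hypothetical placeholders, not part of any Elasticsearch client; you would swap in whatever your client raises when a request is rejected.

```python
import random
import time

class RejectedError(Exception):
    """Hypothetical stand-in for a queue-rejection error from the cluster."""

def send_with_backoff(send, batch, max_attempts=5, base_delay=0.1,
                      sleep=time.sleep):
    """Call send(batch), retrying with exponential backoff on rejection.

    `send` is any callable that raises RejectedError when the cluster
    pushes back. `sleep` is injectable so the logic can be tested.
    """
    for attempt in range(max_attempts):
        try:
            return send(batch)
        except RejectedError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the rejection to the caller
            delay = base_delay * (2 ** attempt)
            # Add jitter so many writers don't all retry at the same instant.
            sleep(delay + random.uniform(0, delay))
```

If the retries keep running out, that is a sign the cluster is genuinely short on indexing capacity rather than just hitting a brief burst.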
If those exceptions are search-related, it means you're sending too many search requests. It's usually indexing-related though.
Honestly, I doubt client nodes will help much in this case. A bottleneck on indexing usually means you are indexing as fast as your throttle settings will allow (or as fast as your disks will allow). So you either need to increase those settings (and potentially hurt search performance) or just add more capacity.