Recommendations around cluster sizes for differentiating node types

I've seen http://www.elastic.co/guide/en/elasticsearch/reference/1.5/modules-node.html and http://stackoverflow.com/a/27871775/137067, but I was hoping for some rough advice on when splitting nodes out into specialized roles (data-only, client, or dedicated master) actually makes sense.

  • If you only have a few (e.g. 5) total nodes each doing master+data+http, is it really helpful to have a separate non-data client node, or does this make sense only for larger numbers?
  • What's the typical ratio of the different node types (e.g. 1 client node for every 3 data nodes?)
  • What's the minimum # of each node type that you want, for high availability?
  • If I'm getting EsRejectedExecutionException[rejected execution (queue capacity 250)], how can I tell whether splitting out separate data:false nodes would be helpful, or whether I just need more capacity overall?

It would be helpful to see a few different overall architecture examples and how people should typically scale up, e.g.:

  • For the smallest level of high availability, start with at least 3 nodes: at least 2 of them should be data:true, and all 3 should be master:true.
  • (Then examples of the next recommended "steps" for continued scaling, e.g. if you have 5, 10, or 30 data nodes, what's a likely starting place for the # of nodes of each type.)

If you only have a few (e.g. 5) total nodes each doing master+data+http, is it really helpful to have a separate non-data client node, or does this make sense only for larger numbers?

Personally, I don't think client nodes are worth it when you have small clusters. At small cluster sizes, it tends to be better to just add another data node with that available machine, since it gives you more horsepower for your cluster.

The main advantages of client nodes are:

  • Slightly reduced memory pressure on data nodes, since the client offloads search / agg reductions
  • "Smart Routing", since they know where all data lives, they can avoid an extra hop
  • Architecturally, it can be useful to use client nodes as your access point to the cluster, so your app doesn't need to know the details (see the config sketch below)
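
For reference, a "client" node in 1.x is just a node with both roles switched off in elasticsearch.yml. A minimal sketch (only the role settings are shown; everything else stays at your normal config):

    # elasticsearch.yml for a client node (ES 1.x settings)
    node.master: false   # never eligible to be elected master
    node.data: false     # holds no shards; only routes/coordinates requests
    http.enabled: true   # accepts REST traffic and fans it out to the data nodes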

What's the typical ratio of the different node types (e.g. 1 client node for every 3 data nodes?)

Really hard to say. I've seen ratios as low as 3 client : 10 data, up to 5:50, and up to 50:60. The people running high ratios (e.g. 50:60) are typically Java shops embedding the NodeClient directly into their applications. Most RESTful clusters run lower ratios.

What's the minimum # of each node type that you want, for high availability?

For dedicated masters, you need at least three; any fewer is a waste. Adding more just increases the availability of your master pool, e.g. five dedicated masters means you can lose two master-eligible nodes and still keep your cluster up. More than three aren't really needed from a performance perspective, since only one of those nodes is the active master at any particular time.
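
As a rough sketch of what that looks like in 1.x settings (the quorum value is just (master-eligible nodes / 2) + 1, so 2 for a pool of 3):

    # elasticsearch.yml for a dedicated master (ES 1.x settings, sketch)
    node.master: true     # eligible to be elected master
    node.data: false      # holds no shards, so heap/GC pressure stays low
    http.enabled: false   # optional: keep client traffic off the masters

    # on every master-eligible node: quorum for 3 dedicated masters = 2
    discovery.zen.minimum_master_nodes: 2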

For data nodes it's impossible to say; it totally depends on your search/indexing/storage requirements.

If I'm getting EsRejectedExecutionException[rejected execution (queue capacity 250)], how can I tell whether splitting out separate data:false nodes would be helpful, or whether I just need more capacity overall?

So what this means is that you are bottlenecking on that queue. If those are indexing-related exceptions, it means you are trying to index faster than your cluster can tolerate at the moment, and the queue is filling up. Queues are basically backpressure telling your app to slow down. If it is indexing-related, you might get some tips from this article: Performance Considerations for Indexing in Elasticsearch
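
For context, the "queue capacity" in that exception is a thread pool queue size, which is configurable in elasticsearch.yml. Which pool applies depends on the operation being rejected (bulk vs. index vs. search), and the values below are purely illustrative — bumping the queue only hides the backpressure, it doesn't fix the bottleneck:

    # ES 1.x thread pool queue sizes (illustrative values, not recommendations)
    threadpool.bulk.queue_size: 100    # queued bulk requests waiting for a free worker
    threadpool.index.queue_size: 250   # queued single-document index/update requests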

If those exceptions are search-related, it means you're sending too many search requests. It's usually indexing-related though. :wink:

Honestly, I doubt client nodes will help much in this case. A bottleneck on indexing usually means you are indexing as fast as your throttle settings will allow (or as fast as your disks will allow). So you either need to increase those settings (and potentially hurt search performance) or just add more capacity.


Thanks a lot for the helpful response.

Yes, the occasional queue-exceeded warnings were from indexing operations.

Based on your answer it sounds like I should just add more capacity but keep them all as [node.master=true, node.data=true, http.enabled=true]. Roughly at what # of nodes do you think we should reconsider this and break out separate data, client, and dedicated master nodes?

If I'm currently running 5 i2.xlarge EC2 instances, would you tend toward adding more of the same, or running the same number (or fewer) of larger instances (e.g. i2.2xlarge or i2.4xlarge)? From your posted article I see that ES can easily use multiple data folders to take advantage of the separate SSDs. I know I should have at least 3 nodes, so is there any advantage to having 8-9 smaller nodes versus e.g. 3 large ones?
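
If I'm reading that article right, the multiple-data-folder setup is just a comma-separated path.data list, something like this (the mount points here are hypothetical):

    # elasticsearch.yml: stripe data across both local SSDs (paths hypothetical)
    path.data: /mnt/ssd0/elasticsearch,/mnt/ssd1/elasticsearch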

I start considering dedicated master nodes when:

  • Cluster size starts to get unruly...maybe like 10 nodes or higher?
  • You see cluster instability due to load (usually caused by memory pressure, which causes long GCs, which cause the master to drop out of the cluster temporarily)

The main purpose of dedicated master nodes is to keep the "master responsibilities" isolated from load which can cause long GCs, and thus cluster instability. That way an overloaded cluster will only bounce data nodes (still not good, obviously), but the master will remain stable. Basically dedicated masters give you breathing room if your cluster overloads, instead of the whole thing starting to melt down.

Also, as an aside, master nodes can usually be quite "light" compared to your data nodes. Maybe a few gb of RAM, moderate CPU, normal disks, etc.

Generally, medium-to-large nodes are less hassle than a bunch of smaller nodes. They are logistically simpler to work with, and ES will happily use all the resources available. The only "gotcha" is to avoid boxes with more than 64gb of RAM. The recommendation is to give half your memory to the OS and to avoid going over ~31gb of heap, due to compressed object pointers in the JVM (and because the CMS GC gets cranky at large heap sizes). So boxes larger than 64gb tend to get trickier to manage.

If you use 64gb+ boxes, you have to start running multiple ES nodes per box, make sure primaries/replicas don't allocate to the same machine, worry about splitting disk IO fairly, maybe tweak threadpools, etc. Just a lot more hassle.
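
If you do end up with multiple ES nodes per box, the primary/replica part is at least only one setting — a sketch using the 1.x allocation option that keeps copies of the same shard off the same physical host:

    # elasticsearch.yml on every node: don't allocate a primary and its replica
    # to two ES processes running on the same physical host
    cluster.routing.allocation.same_shard.host: true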

I'm not an EC2 expert, but those i2.xlarges look to be in a nice range.

Btw, feel free to modify some of the throttling settings. The disk throttling in particular is very conservative, and given your nodes' hardware, you can probably increase it a good amount and still not see any impact on search latency. Also make sure your kernel's IO scheduler is set to noop or deadline since you are using SSDs (the default is usually cfq, which is optimized for rotational media).
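
For the store throttle specifically, these are the settings I mean — the value below is illustrative, tune it to what your SSDs can sustain (it can also be changed dynamically through the cluster settings API):

    # ES 1.x store throttle (sketch; value is illustrative)
    indices.store.throttle.type: merge               # only throttle merges (the default)
    indices.store.throttle.max_bytes_per_sec: 100mb  # default is much lower (20mb)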
