Cache.size and breaker limits on client-only nodes vs. data-only nodes

Hi again all,

As the subject indicates, I'm splitting the roles of my cluster into a single role per JVM. In doing so, I'm re-evaluating all my settings in my elasticsearch.yml file to make sure what I have is appropriate for the different node roles.

I'm not sure what I should have for these values on client-only versus data-only nodes, based on which activities happen on each:


Are some appropriate for client-only, and some for data-only? Or is there a reason to specify them all on both node types?

As I understand it, indexing happens on the data-only nodes, although the indexing requests themselves go to the client nodes first, and searching happens on the client-only nodes. So I'm not sure how to make the best use of my heap on each of those node types.

I'm thinking I want to maximize all "search related" breakers/cache.size on the clients, and maximize indexing-related settings on the data nodes.

Perhaps something like:
Data-only: 70% (the default)
indices.memory.index_buffer_size:  30% 


indices.fielddata.cache.size: 60% 75%
indices.breaker.request.limit: 50%
indices.breaker.fielddata.limit: 65%

I'd definitely appreciate some advice on this one. Am I close to being on track? :smile:
Thanks very much!

This will depend on your datasets really, but it'd be worth testing.
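
Before testing, it can help to turn the percentages into absolute sizes for your heap and sanity-check them. Here is a rough Python sketch of that arithmetic. The helper names and the 8 GiB heap are made up for illustration. The one rule it checks is that `indices.fielddata.cache.size` stays below `indices.breaker.fielddata.limit`; otherwise the breaker rejects requests before the cache ever evicts anything. Please verify that rule against the docs for your version.

```python
# Hypothetical helpers, not part of Elasticsearch: they only do the arithmetic.

def pct_of_heap(setting: str, heap_bytes: int) -> int:
    """Convert a percentage string such as '60%' into bytes of heap."""
    return int(heap_bytes * float(setting.strip().rstrip("%")) / 100)


def check_fielddata(settings: dict, heap_bytes: int) -> list:
    """Return a list of warnings for percentage-valued fielddata settings."""
    warnings = []
    cache = pct_of_heap(settings["indices.fielddata.cache.size"], heap_bytes)
    limit = pct_of_heap(settings["indices.breaker.fielddata.limit"], heap_bytes)
    if cache >= limit:
        warnings.append(
            "indices.fielddata.cache.size should be below "
            "indices.breaker.fielddata.limit"
        )
    return warnings


if __name__ == "__main__":
    heap = 8 * 1024**3  # assumed 8 GiB heap, purely for illustration
    proposed = {
        "indices.fielddata.cache.size": "60%",
        "indices.breaker.fielddata.limit": "65%",
    }
    for name, value in proposed.items():
        print(name, value, pct_of_heap(value, heap), "bytes")
    print(check_fielddata(proposed, heap) or "no warnings")
```

With a cache size of 60% and a limit of 65% the check passes; a cache size of 75% against the same limit would be flagged.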

Also, don't forget that you can set the heap to more than 50% of system RAM on client nodes, as they don't need to leave memory for FS caching. We'd suggest not going beyond 75%, though.
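
To make the heap arithmetic concrete, here is a small Python sketch. The function name and the 16 GB machine are assumptions for illustration, and the 75% figure is the upper bound mentioned above, not a target to aim for.

```python
# Hypothetical helper: maps a node's RAM to a heap size using the
# 50% (data node) and 75% (client node, upper bound) figures above.

def suggested_heap_mb(ram_mb: int, client_only: bool) -> int:
    """Return a heap size in MB for the given amount of RAM."""
    fraction = 0.75 if client_only else 0.50
    return int(ram_mb * fraction)


if __name__ == "__main__":
    ram_mb = 16 * 1024  # assumed 16 GB machine
    print("data-only heap:  ", suggested_heap_mb(ram_mb, client_only=False), "MB")
    print("client-only heap:", suggested_heap_mb(ram_mb, client_only=True), "MB")
```

The result is what you would pass as the JVM heap size for that node (for example through `ES_HEAP_SIZE`, if your version uses that variable).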