Choosing between client nodes and data nodes

altaf · May 17, 2016, 6:30pm

Hi. We have a 3-node Elasticsearch 1.7.5 cluster that has been staying at 85-89% JVM heap usage. Each node has 16GB of RAM allocated to Elasticsearch. Due to the high JVM heap usage, we're planning on expanding our cluster, but we're not sure whether we should add more data nodes or instead add dedicated client node(s). We do perform some fairly heavy-weight aggregations. Is there a way to determine whether we'll get a bigger performance improvement from adding data nodes or from adding client nodes?

warkolm · May 18, 2016, 5:32am

All client nodes do is handle the reduction phase of search or the aggregation calculations.
Chances are that adding client nodes to such a small cluster won't give you as much value as adding more data nodes.

altaf · May 18, 2016, 3:38pm

Thanks for the help, Mark. After looking at our heap usage in Marvel in more depth, it looks like the JVM heap usage spikes we were seeing all come from our field data size increasing significantly during large queries. We're going to add more data nodes.
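
For anyone who wants to check the same thing without Marvel, here is a rough sketch of how the comparison could be done from the node stats API. It assumes a response shaped like `GET /_nodes/stats`; the sample numbers and node names below are made up for illustration, and `fielddata_share` is a hypothetical helper, not something from Elasticsearch itself.

```python
# Made-up sample shaped like a GET /_nodes/stats response.
# Node names and byte counts are illustrative only, not real measurements.
SAMPLE_STATS = {
    "nodes": {
        "node-id-1": {
            "name": "es-data-1",
            "jvm": {"mem": {"heap_used_percent": 87,
                            "heap_max_in_bytes": 16 * 1024 ** 3}},
            "indices": {"fielddata": {"memory_size_in_bytes": 8 * 1024 ** 3}},
        },
        "node-id-2": {
            "name": "es-data-2",
            "jvm": {"mem": {"heap_used_percent": 85,
                            "heap_max_in_bytes": 16 * 1024 ** 3}},
            "indices": {"fielddata": {"memory_size_in_bytes": 4 * 1024 ** 3}},
        },
    }
}


def fielddata_share(stats):
    """Map node name -> (heap_used_percent, fielddata bytes / max heap bytes)."""
    report = {}
    for node in stats["nodes"].values():
        heap = node["jvm"]["mem"]
        fielddata_bytes = node["indices"]["fielddata"]["memory_size_in_bytes"]
        share = fielddata_bytes / heap["heap_max_in_bytes"]
        report[node["name"]] = (heap["heap_used_percent"], share)
    return report


if __name__ == "__main__":
    for name, (used_pct, share) in sorted(fielddata_share(SAMPLE_STATS).items()):
        print(f"{name}: heap used {used_pct}%, fielddata {share:.0%} of max heap")
```

If field data turns out to be a large share of the heap on the data nodes, that pressure sits on the data nodes themselves, so a client node would not relieve it.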

warkolm · May 18, 2016, 3:38pm

Look into doc values too.
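
As a rough sketch of what that could look like on 1.x: doc values are switched on per field in the mapping, and they keep the values used for aggregations and sorting on disk instead of in heap-resident field data. The example below only builds the request body; the index type `event` and the field names are hypothetical, and it assumes the fields are `not_analyzed` strings, since doc values are not available for analyzed strings.

```python
import json


def doc_values_mapping(type_name, field_names):
    """Build a 1.x-style mapping body with doc values enabled on each field.

    Assumes every field is a not_analyzed string; numeric and date fields
    would use their own "type" and drop the "index" setting.
    """
    properties = {
        name: {"type": "string", "index": "not_analyzed", "doc_values": True}
        for name in field_names
    }
    return {"mappings": {type_name: {"properties": properties}}}


# Hypothetical type and field names, for illustration only.
body = doc_values_mapping("event", ["status", "host"])

if __name__ == "__main__":
    # This body would be sent when creating a new index.
    print(json.dumps(body, indent=2, sort_keys=True))
```

The setting only affects newly created fields, so existing data would need to be reindexed into an index created with this mapping before the heap usage drops.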
