Using client nodes to reduce memory pressure on data nodes

Hi,
ES version 1.7.5.

We have high memory pressure on our data nodes, mainly from segment memory (around 65% of the memory).
This leaves very little memory for queries, and we keep getting old-generation GCs.

We are considering using client nodes to reduce the pressure on the data nodes.

Our main query use cases are:

  1. Pulling queries: simple queries that pull 5000 events from ES, ordered descending by one of the columns in the events index.
  2. Some aggregations.
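
For reference, the first use case might look roughly like the request body below. This is only a sketch: the field name `timestamp` is a hypothetical stand-in for whichever column the sort runs on, and `match_all` stands in for the real query.

```python
import json

def build_pull_query(sort_field, size=5000):
    """Build a search body that returns the top `size` events, highest sort value first."""
    return {
        "size": size,
        "query": {"match_all": {}},  # placeholder for the real query
        "sort": [{sort_field: {"order": "desc"}}],
    }

# "timestamp" is an assumed field name, not taken from the real index mapping.
body = build_pull_query("timestamp")
print(json.dumps(body, indent=2))
```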

We added one client node to our prod environment and ran most of our use-case queries through it. When I monitor the JVM utilization of the client node during query execution, I see very minor utilization per query compared to what I see when we run the same query directly against one of the data nodes. It seems the client node does only a very small part of the execution. For the first use case I would expect to see some memory utilization, since the client node should pull all the data from the data nodes and sort it locally. But the utilization is so small that I guess the sort runs on one of the data nodes and the client node only gets the end result. (Our assumption was that a client node is supposed to act as a reducer node that gets the data from the data nodes and does the sort locally.)

Any idea whether client nodes are a good fit for our use case, or when client nodes would be useful?

Thanks

Dedicated client nodes take some load away from the data nodes, e.g. request parsing and collection of results from the data nodes. To what extent it will help depends on your query types and patterns. If it does not help much in your use case, it may be better to try and address the main source of the problem, which based on your description seems to be segment memory.
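
To make "collection of results" a bit more concrete: for a sorted search, each shard sorts its own documents and returns only its top hits, and the node that received the request merges those already-sorted lists into the final page. The sketch below mimics that merge in plain Python. It is not Elasticsearch code, and the shard contents and the `ts` field are made up for illustration.

```python
import heapq
from itertools import islice

def shard_top_n(docs, n, key):
    """Roughly what a data node does per shard: sort locally, keep only the top n."""
    return sorted(docs, key=key, reverse=True)[:n]

def coordinate(shard_results, n, key):
    """Roughly what the coordinating (client) node does: merge pre-sorted lists."""
    merged = heapq.merge(*shard_results, key=key, reverse=True)
    return list(islice(merged, n))

def by_ts(doc):
    return doc["ts"]

# Invented sample data standing in for two shards of the events index.
shard_a = [{"id": "a1", "ts": 5}, {"id": "a2", "ts": 9}, {"id": "a3", "ts": 1}]
shard_b = [{"id": "b1", "ts": 7}, {"id": "b2", "ts": 3}]

partials = [shard_top_n(shard, 3, by_ts) for shard in (shard_a, shard_b)]
top = coordinate(partials, 3, by_ts)
print([doc["id"] for doc in top])
```

In this model the merging node only ever holds up to n entries per shard, and the sorting itself happens on the data nodes, which may be why the client node shows so little heap activity for this kind of query.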

How much data do you have in the cluster? How many indices and shards do you have in the cluster? What is the size and specification of the cluster?
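
One way to collect some of these numbers is the node stats API (`GET /_nodes/stats`), which reports segment memory and heap size per node. The sketch below only parses such a response. The sample payload is invented and trimmed to the fields used, so treat the exact shape as an assumption to check against your own cluster's output.

```python
def segment_heap_share(node_stats):
    """Return {node_name: segment memory as a fraction of max heap}."""
    shares = {}
    for node in node_stats["nodes"].values():
        seg_bytes = node["indices"]["segments"]["memory_in_bytes"]
        heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
        shares[node["name"]] = seg_bytes / heap_max
    return shares

# Invented sample, trimmed to the fields read above.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "data-1",
            "indices": {"segments": {"count": 1200, "memory_in_bytes": 6500}},
            "jvm": {"mem": {"heap_max_in_bytes": 10000}},
        }
    }
}

shares = segment_heap_share(sample)
print(shares)
```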

Where is the sort supposed to run: on the data node or on the client node (in our first use case)?

"collection of results from the data nodes." - can you elaborate more on this part?

And thanks for helping.
