I wonder what the hardware recommendation is for a dedicated client node. I know the master is a very lightweight node that doesn't require good hardware, but how about the client? The docs say the client node does the actual gather processing, so I assume it might need more memory, like a data node. Am I right? Any recommendation would be greatly appreciated.
Can anyone suggest hardware for a dedicated client node? Thanks.
On Friday, October 24, 2014 6:32:26 PM UTC-7, Terence Tung wrote:
I don't use client nodes, so I can't speak from experience here. Most of the gathering steps I can think of amount to merging sorted lists, which isn't particularly intense. I think aggregations (another thing I don't use) can be more intense on the client node, but I'm not sure.
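The "merging sorted lists" gather step can be sketched in a few lines: each shard returns its top hits already sorted, and the coordinating node just does a k-way merge. The shard results below are made-up numbers for illustration:

```python
import heapq

# Each shard returns its top hits sorted by score, descending.
# These (score, doc_id) lists are hypothetical example data.
shard1 = [(9.1, "doc-a"), (7.4, "doc-b")]
shard2 = [(8.8, "doc-c"), (6.0, "doc-d")]
shard3 = [(9.5, "doc-e"), (5.2, "doc-f")]

# heapq.merge lazily merges inputs that are already sorted; negating
# the score makes "descending by score" look ascending to the merge.
merged = heapq.merge(shard1, shard2, shard3, key=lambda hit: -hit[0])
top3 = [doc for _, doc in list(merged)[:3]]
print(top3)  # ['doc-e', 'doc-a', 'doc-c']
```

This is cheap (O(n log k) for k shards), which is why the gather phase by itself usually isn't the bottleneck.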
My recommendation is to start by sending requests directly to the data nodes, and only investigate client nodes if you run into trouble with that and diagnose the trouble as something that would move to a client node if you had one. It's a nice thing to have in your back pocket, but it just hasn't come up for me.
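Sending requests directly to the data nodes usually just means round-robining over their HTTP endpoints on the client side. A minimal sketch, where the node hostnames and index name are hypothetical:

```python
from itertools import cycle

# Hypothetical data-node HTTP endpoints; you would list your actual
# data nodes here instead of routing everything through a client node.
DATA_NODES = [
    "http://es-data-1:9200",
    "http://es-data-2:9200",
    "http://es-data-3:9200",
]

_node_iter = cycle(DATA_NODES)

def search_url(index, query):
    """Build a _search URL against the next data node, round-robin."""
    node = next(_node_iter)
    return f"{node}/{index}/_search?q={query}"

# Consecutive calls rotate through the nodes:
print(search_url("logs", "status:500"))  # hits es-data-1
print(search_url("logs", "status:500"))  # hits es-data-2
```

Whichever node receives the request acts as the coordinating node for it, so this spreads the gather work across the data nodes instead of concentrating it on dedicated clients.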
I have dedicated client nodes for some really intense queries and aggregations. Our clients typically have 2GB of heap, and in our experience that's sufficient; the client node doesn't do a whole lot, since the bulk of the work is done on the data nodes.
cheers
mike
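For reference, a dedicated client node in the Elasticsearch 1.x era (current when this thread was written) was configured by disabling the master and data roles in elasticsearch.yml; this is a sketch of those settings, not a prescription:

```yaml
# elasticsearch.yml -- a dedicated client node holds no data and is
# not master-eligible; it only coordinates requests and merges results.
node.master: false
node.data: false
```

The 2GB heap Mike describes would then be set through the `ES_HEAP_SIZE` environment variable (e.g. `ES_HEAP_SIZE=2g`) rather than in this file.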
On Monday, November 10, 2014 11:24:41 AM UTC-5, Nikolas Everett wrote:
On Mon, Nov 10, 2014 at 11:17 AM, Terence Tung <ter...@teambanjo.com> wrote: