Providing your own ThreadPool to node client


(Andrew Clegg) #1

What is the correct way to provide your own ThreadPool for the Java
Node client to use?

I notice that NodeClient can take a ThreadPool in the constructor, but
the normal way to obtain a NodeClient is from node.client() so you
don't get the option to provide one.

Alternatively, how can I change the settings of the thread pool it
uses?

My use case is that I am setting up HBase to push new data into
ElasticSearch at quite high volumes, but I want to be able to control
how many index calls happen in parallel.

Or does a single NodeClient push all requests through a single thread
anyway, no matter how many threads it's shared between?

Thanks,

Andrew.


(Andrew Clegg) #2

Sorry -- I think this question is moot. On re-reading these docs:

http://www.elasticsearch.org/guide/reference/java-api/index_.html

it seems the threading only applies when the Node you got the Client
from actually contains the index shard.

In our case this won't be true: we'll be using a client-only node
sending to a remote shard.

I guess this means calls happen synchronously, so we'll need to handle
background writes ourselves, right?
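
For reference, handling background writes ourselves could look roughly like the sketch below. It is only an illustration using java.util.concurrent: BoundedIndexer, indexAll and the indexCall callback are hypothetical names standing in for whatever actually sends a document to the cluster, and the pool size is what caps how many index calls run in parallel.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper, not part of any Elasticsearch API.
final class BoundedIndexer {

    // Runs indexCall once per document on a fixed-size pool, so at most
    // maxParallel calls are in flight at any time.
    // Returns the highest number of concurrent calls observed.
    static int indexAll(List<String> docs, int maxParallel, Consumer<String> indexCall) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (String doc : docs) {
            pool.execute(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    indexCall.accept(doc); // assumed to block until the call finishes
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("index calls did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a", "b", "c", "d", "e", "f");
        // The lambda stands in for a real, blocking index request.
        int peak = indexAll(docs, 2, doc -> System.out.println("indexing " + doc));
        System.out.println("peak parallel calls: " + peak);
    }
}
```

Note that Executors.newFixedThreadPool queues pending tasks without limit, so a very fast producer would probably also want a bounded queue or some other form of back-pressure.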


(Shay Banon) #3

You can control whether the call is async or not through the API (it's
always "async" in a sense, so you can choose whether to wait for it or
not). You can still control the thread pools on the actual data nodes in
the cluster, of course, using the thread pool configuration:
http://www.elasticsearch.org/guide/reference/modules/threadpool.html.
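
To illustrate the "wait for it or not" choice, here is a minimal sketch using only the JDK. CompletableFuture is a stand-in for the future the client hands back, and WaitOrNot, indexAsync, indexAndWait and indexAndNotify are hypothetical names, not Elasticsearch calls. In the Java API this should correspond to blocking on the returned future versus passing a listener to execute, but check that against the version you are running.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical illustration; none of these names come from Elasticsearch.
final class WaitOrNot {

    // Stand-in for an index request: the work completes on another thread.
    static CompletableFuture<String> indexAsync(String doc) {
        return CompletableFuture.supplyAsync(() -> "indexed:" + doc);
    }

    // Blocking style: the caller waits until the response is available.
    static String indexAndWait(String doc) {
        return indexAsync(doc).join();
    }

    // Non-blocking style: the caller registers a callback and returns at once.
    // The returned future completes after the callback has run.
    static CompletableFuture<Void> indexAndNotify(String doc, Consumer<String> onResponse) {
        return indexAsync(doc).thenAccept(onResponse);
    }

    public static void main(String[] args) {
        System.out.println(indexAndWait("doc-1"));
        // join() here only keeps the demo alive until the callback has printed.
        indexAndNotify("doc-2", System.out::println).join();
    }
}
```

Either way the request itself runs off the calling thread; the only difference is whether the caller blocks on the result.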

