I'm writing my first river that pulls data from several relational
databases, and I'm wondering how to best schedule periodic indexing tasks.
There is this nice ThreadPool class in ES that already has everything:
schedulers, executors of various kinds, etc. Yet it seems that most
rivers create their own threads or executors.
So is ThreadPool considered an internal API, or is it safe to use it to
schedule tasks in a river?
What is an internal API? What is safe use? It depends. The ES thread pools
work, but they are always subject to change as they are improved. If ES
changes the thread pool code and you cannot migrate your API usage, you can
always copy the old, working thread pool code into your plugin. I also see a
challenge since there is no "stability policy for API change" yet, but I
hope that in the future (after 1.00) methods will be marked "deprecated"
rather than removed or changed so quickly from one version to the next.
Jörg
Sorry, by "safe" I actually meant: is there a risk of disrupting the normal
operation of ES by using threads that are dedicated to internal tasks and
specifically tuned for them, e.g. for short-lived tasks, when a river task may
instead spend a long time waiting for a database to execute a complex query?
Sylvain
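To make the concern above concrete, here is a minimal sketch in plain java.util.concurrent. It does not use any ES class: the single-thread pool is only a stand-in for a small pool tuned for short tasks, and the class and method names (SharedPoolDemo, shortTaskWasBlocked) are hypothetical. It shows a short task queued behind a slow one that simulates waiting on a database.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: a slow task occupying a small shared pool delays short tasks. */
final class SharedPoolDemo {

    /** Returns true if the short task was still waiting while the slow task held the only thread. */
    static boolean shortTaskWasBlocked() throws Exception {
        // Stand-in for a small pool tuned for short-lived work (assumption, not an ES pool).
        ExecutorService shared = Executors.newFixedThreadPool(1);
        CountDownLatch slowStarted = new CountDownLatch(1);
        CountDownLatch queryDone = new CountDownLatch(1);
        try {
            // Slow task: simulates blocking on a long-running database query.
            shared.submit(() -> {
                slowStarted.countDown();
                queryDone.await();
                return null;
            });
            slowStarted.await();

            // Short task: cannot start until the slow task releases the thread.
            Future<String> quick = shared.submit(() -> "done");
            boolean blocked = !quick.isDone();

            queryDone.countDown();           // the "query" finally returns
            quick.get(5, TimeUnit.SECONDS);  // now the short task can complete
            return blocked;
        } finally {
            shared.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("short task blocked: " + shortTaskWasBlocked());
    }
}
```

Whether this matters in practice depends on which pool is borrowed and how it is sized, which is why a separate pool for long, blocking work is the cautious choice.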
I don't know what you are really after, but if you use your own
service/thread pool via the methods in
org.elasticsearch.common.util.concurrent.EsExecutors, you can hardly
interfere with anything (unless you want to drown the VM with unbounded
thread pools, but you don't, I'm sure).
Jörg
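As a rough illustration of the "own thread pool" approach, the sketch below builds a small, bounded scheduler for periodic indexing using only java.util.concurrent. It deliberately does not call EsExecutors, since its helper methods are not quoted in this thread; inside a river one would presumably obtain the executor or thread factory from there instead. RiverScheduler and its methods are hypothetical names.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: a small, bounded scheduler owned by the river itself. */
final class RiverScheduler {
    private final ScheduledExecutorService scheduler;

    RiverScheduler(int threads, String namePrefix) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, namePrefix + "-" + counter.incrementAndGet());
            t.setDaemon(true); // do not keep the JVM alive at shutdown
            return t;
        };
        // Fixed number of threads: bounded, so it cannot drown the VM.
        this.scheduler = Executors.newScheduledThreadPool(threads, factory);
    }

    /** Runs the task immediately, then again `delay` after each run finishes. */
    void schedule(Runnable task, long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so swallow it here.
                // A real river would log the failure.
            }
        }, 0, delay, unit);
    }

    /** Stops the scheduler; call this when the river is closed. */
    void close() {
        scheduler.shutdownNow();
    }
}
```

A fixed delay (rather than a fixed rate) means a slow database query simply pushes the next run back instead of letting runs pile up.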
I was only trying to limit the number of separate thread pools living
independently in the system, and was looking for advice on best practices.
But you're right. Since I'll only need a couple of additional threads,
EsExecutors sounds good.
Thanks!
Sylvain