Java Plugin: Data update correctly implemented?


(Jan-Hendrik Lendholt) #1

Hi,

I am relatively new to ES. We are currently running an ES cluster with 3
Nodes and the default sharding strategy.

I have started writing a plugin that does some basic maintenance: it
updates some data here and there and sets a flag to true or false.
I created a service and registered it. Its constructor gets a Client object
injected.

The service regularly fetches data, processes it, and eventually writes it
back to the index.

My question: is this okay for the whole cluster? I mean, will the changes
that I make with the Client object get propagated to the other nodes and
the replica shards, or do I have to make my plugin shard- and
cluster-aware?

Thanks for a short reply :slight_smile:

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Luca Cavanna) #2

If you work with a Client object, whatever implementation it is, you are
sending normal requests that affect the whole cluster. The only difference
is that you don't need to send your requests to a remote node: your code is
already running inside an Elasticsearch node, which decides where each
request needs to be sent and what to do with it.

Cheers
Luca
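
To make the routing idea concrete, here is a toy model in plain Java. It is
only a sketch and does not use the Elasticsearch API: the shard and node
counts, the allocation scheme, and the use of String.hashCode() as the hash
are all assumptions for illustration. It shows the general principle that a
document id is hashed to a shard, and the write lands on that shard's
primary and replica no matter which node received the request.

```java
import java.util.Arrays;
import java.util.List;

public class RoutingSketch {
    // Assumed values for illustration only: 5 primary shards on 3 nodes.
    static final int PRIMARY_SHARDS = 5;
    static final int NODES = 3;

    /** Picks a shard for a document id: hash modulo the number of primary shards. */
    static int shardFor(String docId) {
        // String.hashCode() is a stand-in for whatever hash the cluster really uses.
        return Math.floorMod(docId.hashCode(), PRIMARY_SHARDS);
    }

    /** Toy allocation (hypothetical): primary of shard n lives on node n % NODES. */
    static int primaryNode(int shard) {
        return shard % NODES;
    }

    /** Toy allocation (hypothetical): the single replica lives on the next node. */
    static int replicaNode(int shard) {
        return (shard + 1) % NODES;
    }

    /**
     * Nodes that end up holding the write. The receiving node only forwards
     * the request, so it has no influence on the result.
     */
    static List<Integer> nodesHoldingWrite(int receivingNode, String docId) {
        int shard = shardFor(docId);
        return Arrays.asList(primaryNode(shard), replicaNode(shard));
    }

    public static void main(String[] args) {
        String docId = "doc-42";
        for (int node = 0; node < NODES; node++) {
            System.out.println("received on node " + node
                    + " -> stored on nodes " + nodesHoldingWrite(node, docId));
        }
    }
}
```

Running main prints the same pair of storing nodes for every receiving node,
which is the behaviour described above.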

(In reply to Jan-Hendrik Lendholt's message of Wednesday, September 11, 2013, 9:13:18 PM UTC+2.)



(Jan-Hendrik Lendholt) #3

Hi Luca,

thanks for your answer :slight_smile: That sounds good.

I got another question you might be able to answer:

Let's say I've got 3 machines, one of them being master and 2 being
replicas.
I install my plugin on all three machines. The plugin code gets executed
once every 60 seconds.

If the plugin runs on a replica first (and on the master second) and
updates data, will the update be propagated to the other replica and to the
master as well?
I am afraid of losing data or doing the work twice.

Thanks :slight_smile:

Jan



(Alexander Reelsen) #4

Hey,

You might be doing the work three times in this setup, once per node. You
could check whether the plugin is running on the master node and, if not,
simply do nothing. This is a simple solution, but it might put some
additional load on your master. I am not sure how big your maintenance
tasks are, so you might need to find a different solution in your case.

Sample code might be here:

--Alex
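
A minimal sketch of the "do nothing unless master" check, in plain Java
without the Elasticsearch API. The class name MasterOnlyTask and the
isLocalNodeMaster supplier are hypothetical. In a real plugin the supplier
would have to be backed by the node's view of the cluster state.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class MasterOnlyTask implements Runnable {
    // Hypothetical hook: answers "is this node currently the master?"
    private final BooleanSupplier isLocalNodeMaster;
    private final Runnable maintenance;

    public MasterOnlyTask(BooleanSupplier isLocalNodeMaster, Runnable maintenance) {
        this.isLocalNodeMaster = isLocalNodeMaster;
        this.maintenance = maintenance;
    }

    @Override
    public void run() {
        if (!isLocalNodeMaster.getAsBoolean()) {
            return; // not the master: skip this round
        }
        maintenance.run();
    }

    public static void main(String[] args) {
        AtomicInteger runs = new AtomicInteger();
        // Simulate one scheduled tick on three nodes, only one of which is master.
        boolean[] masterFlags = {false, true, false};
        for (boolean isMaster : masterFlags) {
            new MasterOnlyTask(() -> isMaster, runs::incrementAndGet).run();
        }
        System.out.println("maintenance runs: " + runs.get());
    }
}
```

The check is repeated on every tick, so a node that becomes master later
picks up the work on its next scheduled run.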

(In reply to Jan-Hendrik Lendholt's message of Fri, Sep 13, 2013 at 7:53 AM.)



(Jan-Hendrik Lendholt) #5

Hi Alex,

thanks for the quick reply :slight_smile:

I handled it a little differently:

I implemented LocalNodeMasterListener, which makes sure that the plugin
only executes the maintenance tasks on the master node.
If the master node gets demoted and another node gets promoted, the plugin
will start working on the new master node.
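
The start/stop behaviour described here could look roughly like the
following. This is a sketch under assumptions, not the actual plugin code:
MasterListener is a hypothetical stand-in that only mirrors the onMaster()
and offMaster() callbacks of LocalNodeMasterListener (executor selection is
left out), and a plain ScheduledExecutorService replaces the node's thread
pool. The 60-second period is taken from the earlier post.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the master-change callbacks. */
interface MasterListener {
    void onMaster();
    void offMaster();
}

public class MaintenanceScheduler implements MasterListener {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable maintenance;
    private final long periodSeconds;
    private ScheduledFuture<?> running; // null while this node is not master

    public MaintenanceScheduler(Runnable maintenance, long periodSeconds) {
        this.maintenance = maintenance;
        this.periodSeconds = periodSeconds;
    }

    /** This node became master: start the periodic task if it is not running. */
    @Override
    public synchronized void onMaster() {
        if (running == null) {
            running = executor.scheduleAtFixedRate(
                    maintenance, 0, periodSeconds, TimeUnit.SECONDS);
        }
    }

    /** This node lost the master role: cancel future runs of the task. */
    @Override
    public synchronized void offMaster() {
        if (running != null) {
            running.cancel(false); // let a pass that is in progress finish
            running = null;
        }
    }

    public synchronized boolean isActive() {
        return running != null;
    }

    public void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        MaintenanceScheduler scheduler = new MaintenanceScheduler(
                () -> System.out.println("maintenance pass"), 60);
        scheduler.onMaster();  // promoted: task starts
        Thread.sleep(100);
        scheduler.offMaster(); // demoted: task stops
        scheduler.shutdown();
    }
}
```

Compared with checking the master flag on every tick, this variant keeps
the other nodes completely idle until one of them is promoted.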

One short question: the maintenance task selects and updates several
documents, up to a few thousand. Which executor name should I take from
ThreadPool.Names? I chose INDEX, or should I rather choose SEARCH?

Thank you guys!

Jan


