I am relatively new to ES. We are currently running an ES cluster with 3
nodes and the default sharding strategy.
I have started writing a plugin that does some basic maintenance: it
updates some data here and there and sets a flag to true or false.
I created a service and registered it. The constructor gets a Client object
injected.
I then regularly fetch data, process it, and eventually write it back to
the index.
My question: Is this okay for the whole cluster? I mean, will the changes
that I make with the Client object get propagated to the other nodes and
the replica shards, or do I have to make my plugin shard- and
cluster-agnostic?
If you work with a Client object, whatever implementation it is, you are
sending normal requests that affect the whole cluster. The only difference
is that you don't need to send your requests to a remote node: your code is
already running inside an Elasticsearch node, which decides where each
request needs to go and what to do with it. Writes are routed to the right
primary shard and replicated to its replicas like any other request.
Cheers
Luca
(In reply to Jan-Hendrik Lendholt's original post above, sent Wednesday,
September 11, 2013 9:13:18 PM UTC+2.)
I got another question you might be able to answer:
Let's say I've got 3 machines, one of them being master and 2 being
replicas.
I install my plugin on all three machines. The plugin code gets executed
once every 60 seconds.
If the plugin gets called on the replica first (and on the master second)
and updates data, will it be propagated to the other replica and to the
master as well?
I am afraid of losing data or doing the work twice.
You might be doing the work three times in this setup. You could check
whether the plugin is running on the master node and, if not, simply do
nothing. This is a simple solution and might put some additional load on
your master; I am not sure how big your maintenance tasks are, so you might
need to find a different solution in your case.
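The "do nothing unless you are the master" check suggested above could look
roughly like the sketch below. It is a plain-Java simulation, not
Elasticsearch API code: MasterGuardedTask, MasterGuardDemo and the
localNodeIsMaster supplier are hypothetical names, and in a real plugin the
supplier would have to be backed by the node's own view of the cluster state.
The lambda syntax assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical wrapper: runs the maintenance work only if the local node
// currently reports itself as the elected master.
final class MasterGuardedTask implements Runnable {
    private final String nodeName;
    // Assumption: in a real plugin this would query the cluster state.
    private final BooleanSupplier localNodeIsMaster;
    private final List<String> runLog;

    MasterGuardedTask(String nodeName, BooleanSupplier localNodeIsMaster, List<String> runLog) {
        this.nodeName = nodeName;
        this.localNodeIsMaster = localNodeIsMaster;
        this.runLog = runLog;
    }

    @Override
    public void run() {
        if (!localNodeIsMaster.getAsBoolean()) {
            return; // not the master: skip this round entirely
        }
        runLog.add(nodeName); // stand-in for the real fetch/update work
    }
}

final class MasterGuardDemo {
    // Simulates one scheduling round on a three-node cluster and returns
    // the names of the nodes that actually performed the work.
    static List<String> runOneRound(String electedMaster) {
        List<String> runLog = new ArrayList<>();
        for (String node : new String[] {"node-1", "node-2", "node-3"}) {
            new MasterGuardedTask(node, () -> node.equals(electedMaster), runLog).run();
        }
        return runLog;
    }

    public static void main(String[] args) {
        System.out.println(runOneRound("node-2")); // prints [node-2]
    }
}
```

The task is still scheduled on all three nodes, but only one of them does the
work in any given round, so nothing is processed three times.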
I implemented LocalNodeMasterListener, which makes sure that the plugin
only executes the maintenance tasks on the master node.
If the master node gets demoted and another node gets promoted, the plugin
will start working on the new master node.
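For anyone following along, here is a minimal sketch of that hand-over
behaviour. It does not use the Elasticsearch classes: the MasterListener
interface is a hypothetical stand-in whose onMaster/offMaster callbacks are
modelled on LocalNodeMasterListener, and MaintenanceService and FailoverDemo
are made-up names for illustration only.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the callbacks a node receives when it
// gains or loses the master role.
interface MasterListener {
    void onMaster();
    void offMaster();
}

// Hypothetical service: the scheduler ticks on every node, but the work
// only runs while this node is flagged as master.
final class MaintenanceService implements MasterListener {
    private final AtomicBoolean active = new AtomicBoolean(false);
    private final AtomicInteger runs = new AtomicInteger();

    @Override
    public void onMaster() {
        active.set(true);
    }

    @Override
    public void offMaster() {
        active.set(false);
    }

    // Called on every node by its scheduler, e.g. once every 60 seconds.
    void tick() {
        if (active.get()) {
            runs.incrementAndGet(); // stand-in for the real maintenance work
        }
    }

    int runCount() {
        return runs.get();
    }
}

final class FailoverDemo {
    // Returns the run counts {nodeA, nodeB} after a simulated master change.
    static int[] simulate() {
        MaintenanceService nodeA = new MaintenanceService();
        MaintenanceService nodeB = new MaintenanceService();

        nodeA.onMaster();            // node A is elected master
        nodeA.tick(); nodeB.tick();  // round 1: only A works

        nodeA.offMaster();           // A loses the master role
        nodeB.onMaster();            // B is promoted
        nodeA.tick(); nodeB.tick();  // round 2: only B works
        nodeA.tick(); nodeB.tick();  // round 3: only B works

        return new int[] {nodeA.runCount(), nodeB.runCount()};
    }

    public static void main(String[] args) {
        int[] counts = simulate();
        System.out.println(counts[0] + " " + counts[1]); // prints "1 2"
    }
}
```

Each round is handled by exactly one node, before and after the master
change. The sketch does not cover a task that is still running at the moment
the role is lost.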
One short question: the maintenance task selects and updates several
documents, up to a few thousand. Which executor name should I take from ThreadPool.Names?
I chose INDEX, or should I rather choose SEARCH?