Java Plugin: Data update correctly implemented?

Jan_Hendrik_Lendholt · September 11, 2013, 7:13pm

Hi,

I am relatively new to ES. We are currently running an ES cluster with 3
Nodes and the default sharding strategy.

I have now started writing a plugin that will do some basic maintenance:
update some data here and there and set a flag to true or false.
I created a service and registered it. The constructor gets a Client object
injected.

Then I regularly fetch data, process it, and eventually write it back to
the index.

My question: Is this okay for the whole cluster? I mean, will the changes
that I make with the Client object get propagated to the other nodes and
the replica shards, or do I have to make my plugin shard- and
cluster-aware?

Thanks for a short reply

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

javanna · September 12, 2013, 11:32am

If you work with a Client object, whichever implementation it is, you are
sending normal requests that affect the whole cluster. You don't need to
send your requests to a particular node: your code is already running
within an Elasticsearch node, which decides where each request needs to be
sent and what to do with it.
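
To make the routing idea above concrete, here is a toy model in plain Java. It is not Elasticsearch code: `ToyRouter` and its methods are hypothetical names, and `String.hashCode()` stands in for Elasticsearch's own hash function. It only illustrates the principle that the node receiving a write picks a shard from the document id and that the write then reaches the primary and its replicas, so plugin code does not have to do this itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT Elasticsearch code) of routing a write by document id. */
class ToyRouter {
    private final int primaryShards;
    private final int replicasPerShard;

    ToyRouter(int primaryShards, int replicasPerShard) {
        if (primaryShards < 1 || replicasPerShard < 0) {
            throw new IllegalArgumentException("need >= 1 primary shard and >= 0 replicas");
        }
        this.primaryShards = primaryShards;
        this.replicasPerShard = replicasPerShard;
    }

    /** Picks a shard as hash(id) modulo the number of primary shards. */
    int shardFor(String docId) {
        // Assumption: String.hashCode() is only a stand-in for the real hash.
        return Math.floorMod(docId.hashCode(), primaryShards);
    }

    /** Lists the shard copies a write would reach: primary first, then replicas. */
    List<String> copiesWrittenFor(String docId) {
        int shard = shardFor(docId);
        List<String> copies = new ArrayList<>();
        copies.add("shard " + shard + " primary");
        for (int r = 1; r <= replicasPerShard; r++) {
            copies.add("shard " + shard + " replica " + r);
        }
        return copies;
    }
}
```

With `new ToyRouter(5, 1)`, every call to `copiesWrittenFor` with the same id returns the same two copies, whichever caller asks. That is the property that lets a plugin simply hand its requests to the injected Client.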

Cheers
Luca


Jan_Hendrik_Lendholt · September 13, 2013, 5:53am

Hi Luca,

Thanks for your answer. That sounds good.

I got another question you might be able to answer:

Let's say I've got 3 machines, one of them being master and 2 being
replicas.
I install my plugin on all three machines. The plugin code gets executed
once every 60 seconds.

If the plugin gets called on the replica first (and on the master second)
and updates data, will it be propagated to the other replica and to the
master as well?
I am afraid of losing data or doing the work twice.

Thanks

Jan


spinscale · September 13, 2013, 7:14am

Hey,

You might be doing the work three times in this setup. You could check
whether the plugin is running on the master node and, if not, simply do
nothing. This is a simple solution and might put some additional load on
your master; I am not sure how big your maintenance tasks are, so you might
need to find a different solution in your case.
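
A minimal sketch of that check, in plain Java so it compiles on its own. `MasterOnlyTask` is a hypothetical name, and the master check is passed in as a `BooleanSupplier`. In a real plugin the supplier would be backed by the node's cluster state (possibly something like `clusterService.state().nodes().localNodeMaster()`, but treat that call as an assumption to verify against your Elasticsearch version).

```java
import java.util.function.BooleanSupplier;

/** Runs the wrapped maintenance task only when the local node is the elected master. */
class MasterOnlyTask implements Runnable {
    private final BooleanSupplier localNodeIsMaster; // hypothetical hook into cluster state
    private final Runnable maintenance;

    MasterOnlyTask(BooleanSupplier localNodeIsMaster, Runnable maintenance) {
        this.localNodeIsMaster = localNodeIsMaster;
        this.maintenance = maintenance;
    }

    @Override
    public void run() {
        // Re-check on every run, because the elected master can change over time.
        if (!localNodeIsMaster.getAsBoolean()) {
            return; // some other node is master and is expected to do the work
        }
        maintenance.run();
    }
}
```

Scheduling this wrapper every 60 seconds on all three nodes means each node wakes up, but only the current master actually touches the index.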

Sample code might be here:

https://github.com/spinscale/elasticsearch-graphite-plugin/blob/master/src/main/java/org/elasticsearch/service/graphite/GraphiteService.java#L78-L94

        }
        if (graphiteReporterThread != null) {
            graphiteReporterThread.interrupt();
        }
        closed = true;
        logger.info("Graphite reporter stopped");
    }

    @Override
    protected void doClose() throws ElasticsearchException {}

    public class GraphiteReporterThread implements Runnable {

        private final Pattern graphiteInclusionRegex;
        private final Pattern graphiteExclusionRegex;

        public GraphiteReporterThread(Pattern graphiteInclusionRegex, Pattern graphiteExclusionRegex) {
--Alex


Jan_Hendrik_Lendholt · September 13, 2013, 9:10am

Hi Alex,

Thanks for the quick reply.

I handled it a little differently:

I implemented LocalNodeMasterListener, which makes sure that the plugin
only executes the maintenance tasks on the master node.
If the master node gets demoted and another node gets promoted, the plugin
will start to work on the new master node.
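
The listener approach described above could look roughly like this. To keep the sketch compilable without Elasticsearch on the classpath, `MasterListener` is a hypothetical stand-in that mirrors the three methods of `LocalNodeMasterListener` (`onMaster()`, `offMaster()`, `executorName()`); `MaintenanceScheduler` is likewise an illustrative name, and the executor name is left as a constructor argument because the right choice is exactly the open question below.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in mirroring Elasticsearch's LocalNodeMasterListener. */
interface MasterListener {
    void onMaster();
    void offMaster();
    String executorName();
}

/** Starts periodic maintenance when this node becomes master and stops it on demotion. */
class MaintenanceScheduler implements MasterListener {
    // Daemon thread so an idle scheduler never keeps the JVM alive.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "maintenance");
                t.setDaemon(true);
                return t;
            });
    private final Runnable maintenance;
    private final long intervalSeconds;
    private final String executorName;
    private ScheduledFuture<?> running; // non-null only while this node is master

    MaintenanceScheduler(Runnable maintenance, long intervalSeconds, String executorName) {
        this.maintenance = maintenance;
        this.intervalSeconds = intervalSeconds;
        this.executorName = executorName;
    }

    @Override
    public synchronized void onMaster() {
        if (running == null) { // ignore repeated promotions
            running = scheduler.scheduleWithFixedDelay(
                    maintenance, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        }
    }

    @Override
    public synchronized void offMaster() {
        if (running != null) {
            running.cancel(false); // let a run in progress finish
            running = null;
        }
    }

    @Override
    public String executorName() {
        return executorName; // pool on which the callbacks should be invoked
    }

    synchronized boolean isScheduled() {
        return running != null;
    }

    void close() {
        offMaster();
        scheduler.shutdownNow();
    }
}
```

Compared with checking the master flag on every run, this design keeps the other nodes completely idle and moves the work automatically when a new master is elected.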

One short question: the maintenance task selects and updates several
documents, up to a few thousand. Which executor name from ThreadPool.Names
should I use? I chose INDEX, or should I rather choose SEARCH?

Thank you guys!

Jan
