Plugin questions (rolling index)

Hey,

I created a (not yet working) plugin for a time-dependent rolling
index which utilizes my old code from issue 1500:
https://github.com/karussell/es-rollplugin

I would like to create a periodic checking-thread.

  1. How would I create a singleton thread (only one per cluster)?
  2. Also '@Inject private RollAction action;' does not work in a plugin.
    How can I inject an instance of RollAction into this singleton thread?
  3. At which central place can I put meta information to record that an
    index creation was successful?

Or would you organize the flow differently? In the issue, someone
mentioned that instead of the very specific time-dependent use case I have
in mind, he would trigger a more generic checking method à la
'shouldIRollTheIndex' every time documents are fed. Wouldn't this hurt
performance, or could this work?

Ideas?

Regards,
Peter.

--

Peter,

I think this is a really great idea.

  1. That sounds like a river to me.
  2. Where are you trying to inject it?
  3. If you implement it as river, the _river index will be a logical place
    to store this information.

Igor

On Wednesday, October 24, 2012 2:55:58 PM UTC-4, Karussell wrote:

Hey,

I created a (not yet working) plugin for a time-dependent rolling
index which utilizes my old code of issue 1500:
https://github.com/karussell/es-rollplugin

I would like to create a periodic checking-thread.

  1. How would I create a singleton thread (only one per cluster)?
  2. Also '@Inject private RollAction action;' does not work in a plugin.
    How can I inject an instance of RollAction into this singleton thread?
  3. At which central place can I put meta information to record that an
    index creation was successful?

Or would you organize the flow differently? In the issue, someone
mentioned that instead of the very specific time-dependent use case I have
in mind, he would trigger a more generic checking method à la
'shouldIRollTheIndex' every time documents are fed. Wouldn't this hurt
performance, or could this work?

Ideas?

Regards,
Peter.

--

Hey Igor,

thanks! I'll try this again as a river. (I always thought rivers were
about external data, but I get what you mean:
the 'rolling river' (how can that be ;)) is a singleton in the cluster and
'waits for some external events' ...)

  2. Where are you trying to inject it?

Into the singleton thread, which is created once in the plugin. So I need
an instance of the action in the plugin ...

... ah, I think elasticsearch is using constructor injection. My bad. I was
used to field injection... I'll try that again as well.
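
For readers following along, the difference can be sketched in plain Java. This is only an illustration under assumptions: the RollAction below is a hypothetical stand-in for the plugin's class, RollChecker is an invented name, and the wiring is done by hand. Inside Elasticsearch the constructor would presumably carry the bundled Guice annotation (org.elasticsearch.common.inject.Inject) so that the injector supplies the argument.

```java
// Hypothetical stand-in for the plugin's RollAction (the real class lives in the plugin).
class RollAction {
    private int calls = 0;

    void rollIfDue() {
        calls++; // the real action would create and switch indices here
    }

    int calls() {
        return calls;
    }
}

// The periodic checker receives its collaborator through the constructor
// instead of through an annotated private field.
class RollChecker implements Runnable {
    private final RollAction action;

    // Assumption: in a real plugin this constructor would be annotated with @Inject.
    RollChecker(RollAction action) {
        if (action == null) {
            throw new IllegalArgumentException("action must not be null");
        }
        this.action = action;
    }

    @Override
    public void run() {
        action.rollIfDue();
    }
}

class ConstructorInjectionDemo {
    public static void main(String[] args) {
        RollAction action = new RollAction();
        RollChecker checker = new RollChecker(action); // the injector would do this step
        checker.run();
        System.out.println("roll checks performed: " + action.calls()); // prints 1
    }
}
```

A side benefit of the constructor form is that the field can be final and the checker can never run without its action.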

Regards,
Peter.

On Wednesday, October 24, 2012 9:26:20 PM UTC+2, Igor Motov wrote:

Peter,

I think this is a really great idea.

  1. That sounds like a river to me.
  2. Where are you trying to inject it?
  3. If you implement it as river, the _river index will be a logical place
    to store this information.

Igor

--

Hi Peter,

Looks like you want to build a dispatcher service, not a river. Just a
hint, if you are keen on building something similar to the river service
itself: create a service on a node, and register a ClusterStateListener
(like RoutingService does, for example) with the cluster service. A central
place to put custom metadata is the cluster metadata, probably by using the
putCustom() method in the builder; see the
org.elasticsearch.cluster.metadata.MetaData class.
When you receive a ClusterChangedEvent, you can look up cluster-wide
objects in a map via event.state().metaData().customs(). YMMV, because it is
only roughly documented.

Cheers, Jörg

--

Hey Jörg and Igor,

To keep things simple, I gave up the idea of a complete solution, which
would require quartz or similar.

Now it is possible to call the rolling code via REST. See
https://github.com/karussell/es-rollplugin/blob/master/Readme.md

... and one can now easily set up a cron job to roll the index weekly or
daily, or call it while feeding, etc.

A minor advantage of this implementation is that you can also call it
manually at irregular intervals.
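
On the earlier performance worry about a 'shouldIRollTheIndex' check during feeding: a purely time-based check can be kept very cheap. The sketch below is an illustration only, not the plugin's code; RollGate and its parameters are hypothetical names. It assumes a single feeding JVM and does not coordinate anything across the cluster.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: decides whether the caller should trigger a roll now.
class RollGate {
    private final long intervalMillis;
    private final AtomicLong nextRollAt;

    RollGate(long intervalMillis, long firstRollAtMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("intervalMillis must be positive");
        }
        this.intervalMillis = intervalMillis;
        this.nextRollAt = new AtomicLong(firstRollAtMillis);
    }

    // Returns true for exactly one caller each time the deadline has passed.
    boolean shouldRoll(long nowMillis) {
        long due = nextRollAt.get();
        if (nowMillis < due) {
            return false; // common case: one atomic read and one comparison
        }
        // Only the thread that wins the compare-and-set triggers the roll.
        return nextRollAt.compareAndSet(due, nowMillis + intervalMillis);
    }
}

class RollGateDemo {
    public static void main(String[] args) {
        RollGate gate = new RollGate(1000L, 1000L);
        System.out.println(gate.shouldRoll(999L));  // false, deadline not reached
        System.out.println(gate.shouldRoll(1000L)); // true, this caller should roll
        System.out.println(gate.shouldRoll(1001L)); // false, next deadline is 2000
        System.out.println(gate.shouldRoll(2000L)); // true
    }
}
```

A feeder would call gate.shouldRoll(System.currentTimeMillis()) per batch and hit the REST endpoint only when it returns true. A size-based or document-count-based condition would cost more than this and would need measuring.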

Regards,
Peter.

--

The plugin is now at https://github.com/karussell/elasticsearch-rollindex
to make the installation simpler.

Peter.

BTW: Why do I need to install the plugin on every node ... ElasticSearch
could handle that too ;) ?

--

Hi Peter,

thanks for your effort!

You won't need solutions like quartz, a ScheduledThreadPoolExecutor would
do in most cases.
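
A minimal sketch of that suggestion, using only java.util.concurrent. The class name PeriodicRollCheck and the 50 ms period are made up for the demo, and the check itself is a placeholder. Note that this gives one checking thread per JVM (that is, per node), so it does not by itself answer the "only one per cluster" question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around a single-threaded scheduler.
class PeriodicRollCheck {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);

    // Runs `check` every `period` units; the first run happens after one period.
    void start(Runnable check, long period, TimeUnit unit) {
        executor.scheduleAtFixedRate(() -> {
            try {
                check.run();
            } catch (RuntimeException e) {
                // An exception escaping the task would suppress all later runs.
                System.err.println("roll check failed: " + e);
            }
        }, period, period, unit);
    }

    void stop() {
        executor.shutdownNow();
    }
}

class PeriodicRollCheckDemo {
    public static void main(String[] args) throws InterruptedException {
        CountDownLatch threeRuns = new CountDownLatch(3);
        PeriodicRollCheck periodic = new PeriodicRollCheck();
        periodic.start(threeRuns::countDown, 50, TimeUnit.MILLISECONDS);
        boolean done = threeRuns.await(5, TimeUnit.SECONDS);
        periodic.stop();
        System.out.println("three checks ran: " + done);
    }
}
```

In a plugin, start() and stop() would presumably be tied to the node's lifecycle so that the thread does not outlive the node.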

I think, for production purposes, the easiest and most preferable method is
to roll out the ES binaries across the nodes with plugins already attached
before nodes are started up.

"Live" plugin proliferation over all nodes would be a nice feature, indeed.
But there are some challenges. Plugins need full node restarts.

Besides this, plugins would really love to get some more useful features.
For example: identifying and versioning plugins, enabling/disabling
plugins, plugin target matching (validating whether the underlying ES
version is permitted by the plugin author), dependency recognition between
plugins (evaluating a hierarchical tree of installed and activated plugins),
and integration of plugin status info into cluster/node stats, only to name
a few of my humblest wishes ;) So volunteers, come on in and help out :)

Jörg

On Thursday, October 25, 2012 2:10:07 PM UTC+2, Karussell wrote:

The plugin is now at https://github.com/karussell/elasticsearch-rollindex
to make the installation simpler.

Peter.

BTW: Why do I need to install the plugin on every node ... Elasticsearch
could handle that too ;) ?

--

You won't need solutions like quartz, a ScheduledThreadPoolExecutor would
do in most cases.

ah, ok. I'll look into that when I've more time.

I think, for production purposes, the easiest and most preferable method
is to roll out the ES binaries across the nodes with plugins already
attached before nodes are started up.

yes + for security reasons I've configured my nodes so that they won't start
without my rollindex plugin :) (I found an option for that -> see the plugin
docs)

Regards,
Peter.

--