[ANN] Elasticsearch Simple Action Plugin

Hi,

Many of us want to start writing extensions for Elasticsearch.

Besides the option of submitting pull requests to the core code, one great
advantage of Elasticsearch is the plugin mechanism. With it, custom code can
be hooked into Elasticsearch without having to ask for inclusion in the core
code. Plugin code can also be published on GitHub and easily included
in a running ES instance by using the ES plugin command line tool.

Unfortunately, writing plugins is not as easy as it seems. There are many
plugins, some of them very advanced, and finding a starting point for a
personal project can be quite hard.

Hence, for educational purposes, I wrote a tiny plugin as a starting
point, to demonstrate how a plugin works.

The simple plugin is indeed very simple. It reuses the standard
search action:

  • it defines a built-in query (a "match all" query)

  • it creates a custom action for it

  • the action is called from the Java API

  • the result of the action (the search response of the "match all" query)
    is logged

The plugin code comes with a JUnit test. It is available at

GitHub - jprante/elasticsearch-simple-action-plugin: A simple action plugin for Elasticsearch

In the hope it is useful,

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH-M6%2BZroAz8Reb3e2agW0vXKSavk%3D0hD_bq%2BBHtRYLhw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Thank you. About the plugin: do you have to install it on all nodes in the cluster to make it work, or can you install it on just a non-data node?

Hi Jörg,

Thanks a lot! Some months ago I started to write a plugin and it was
really difficult. This skeleton is perfect for a cold start :)

On Tue, Jun 3, 2014 at 12:15 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

Hi,

many of us want to start writing extensions for Elasticsearch.

Except submitting pull requests to the core code, one great advantage of
Elasticsearch is the plugin mechanism. Here, custom code can be hooked into
Elasticsearch, without having to ask for inclusion into the core code.
Nevertheless, plugin code can be published on Github and easily included
into a running ES instance by using the ES plugin command line tool.

Unfortunately, writing plugins is not so easy as it seems. There are many
plugins, some of them are very advanced, and finding a starting point for a
personal project could be quite hard.

Hence, for educational purposes, I wrote a tiny plugin, as a starting
point, to demonstrate how a plugin works.

The simple plugin is indeed very simple. It makes reuse of the standard
search action:

  • it defines a built-in query (a "match all" query)

  • it creates a custom action for it

  • the action is called from Java API

  • the result of the action (the search response of the "match all" query)
    is logged

The plugin code comes with a junit test. It is available at

GitHub - jprante/elasticsearch-simple-action-plugin: A simple action plugin for Elasticsearch

In the hope it is useful,

Jörg

--
Luiz Guilherme P. Santos

Usually, plugins that extend internal ES functionality should be installed
on all nodes. This is easy to remember and preferable from an
administrative point of view. All the nodes in the ES cluster must have
access to the plugin code under all circumstances, especially when executing
actions, mappers, routers, discovery helpers, analyzer code, indexing helpers...

In this case, for the simple action demo plugin, you can install it on just
the node of the cluster from which you want to execute the demo "match_all"
search. The "match_all" search then searches all the indexes of
the cluster.

If you want to execute the demo plugin "match_all" search from other nodes,
you would have to install the plugin on those other nodes, too.

Jörg

Thank you Jörg. I see the point. But if the plugin consumes memory (e.g. it holds a HashMap for customized scoring), installing it on all nodes may waste memory across the cluster. Is there any way to deal with this issue?

Not sure if I understand your concern completely - as long as you're doing
things right in your code, it should be possible to allocate resources only
when required - this holds also for plugins.

Jörg
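
The advice above about allocating resources only when required can be sketched in plain Java. This is only an illustration under assumptions: `LazyBoostTable` and its loader are hypothetical names, not part of Elasticsearch or of the plugin discussed here. The map is built on the first lookup, so a node that never scores a document never pays for it.

```java
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical holder that defers building an expensive map until first use. */
final class LazyBoostTable {
    private final Supplier<Map<String, Float>> loader;
    // Stays null until the first lookup; volatile so other threads see the loaded map.
    private volatile Map<String, Float> table;

    LazyBoostTable(Supplier<Map<String, Float>> loader) {
        this.loader = loader;
    }

    /** True once the table has been loaded into memory. */
    boolean isLoaded() {
        return table != null;
    }

    /** Returns the boost for a key, loading the table on first call; 1.0 if absent. */
    float boostFor(String key) {
        Map<String, Float> local = table;
        if (local == null) {
            synchronized (this) {
                local = table;
                if (local == null) {
                    // Loaded at most once, even with concurrent callers.
                    local = loader.get();
                    table = local;
                }
            }
        }
        Float boost = local.get(key);
        return boost == null ? 1.0f : boost;
    }
}
```

With this shape, installing the code on every node costs little by itself; memory is only spent on nodes where `boostFor` is actually called.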

The problem is that only one copy of the HashMap is needed to customize the score of all documents in the cluster. But since we have to install the plugin on all nodes, the actual memory used is multiplied by the number of nodes in the cluster. I am trying to figure out a way to save memory. I tried installing it on a non-data node, but that does not seem to work.

You need resources on all nodes that hold shards. You cannot do it with
just one instance, because an ES index is distributed. Rescoring would be very
expensive if you did it on an extra central instance with an extra
scatter/gather phase. It is also very expensive in scripting.

A better method is a similarity plugin like
GitHub - tlrx/elasticsearch-custom-similarity-provider: A custom SimilarityProvider example for Elasticsearch

Not sure what your code looks like, though; maybe you can share it with the
community?

Jörg
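
The point about distributed scoring can be illustrated with a toy scatter/gather model in plain Java. It is a sketch under assumptions, not Elasticsearch code: `Shard`, `Coordinator` and `ScoredDoc` are hypothetical names. Each shard applies the boost table to its own documents and returns only its local top hits; the coordinating step merely merges those lists, which is why the table has to be reachable wherever shards live.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** A document id with its final (boosted) score. */
final class ScoredDoc {
    final String id;
    final float score;

    ScoredDoc(String id, float score) {
        this.id = id;
        this.score = score;
    }
}

/** Hypothetical shard: scores its own documents using a node-local boost table. */
final class Shard {
    private final Map<String, Float> baseScores; // documents stored on this shard
    private final Map<String, Float> boosts;     // boost table available on this node

    Shard(Map<String, Float> baseScores, Map<String, Float> boosts) {
        this.baseScores = baseScores;
        this.boosts = boosts;
    }

    /** Scatter phase: compute boosted scores locally and return the best k. */
    List<ScoredDoc> topK(int k) {
        List<ScoredDoc> scored = new ArrayList<>();
        for (Map.Entry<String, Float> e : baseScores.entrySet()) {
            float boost = boosts.getOrDefault(e.getKey(), 1.0f);
            scored.add(new ScoredDoc(e.getKey(), e.getValue() * boost));
        }
        return Coordinator.best(scored, k);
    }
}

/** Hypothetical coordinating node: merges per-shard results, never rescoring. */
final class Coordinator {
    /** Sorts by descending score and keeps at most k entries. */
    static List<ScoredDoc> best(List<ScoredDoc> docs, int k) {
        List<ScoredDoc> sorted = new ArrayList<>(docs);
        sorted.sort((a, b) -> Float.compare(b.score, a.score));
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    /** Gather phase: collect each shard's top k and merge them. */
    static List<ScoredDoc> search(List<Shard> shards, int k) {
        List<ScoredDoc> gathered = new ArrayList<>();
        for (Shard shard : shards) {
            gathered.addAll(shard.topK(k));
        }
        return best(gathered, k);
    }
}
```

In this model the coordinator only ever sees at most k hits per shard. Moving the boost step to the coordinator would require shipping every candidate document to it first, which is the extra cost described above.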

Sorry, the plugin is outdated, a better start is by looking at

Jörg

On Wed, Jun 4, 2014 at 10:07 AM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

You need resources on all nodes that hold shards, you can not do it with
just one instance, because ES index is distributed. Rescoring would be very
expensive if you did it on an extra central instance with an extra
scatter/gather phase. It is also very expensive in scripting.

A better method is a similarity plugin like
GitHub - tlrx/elasticsearch-custom-similarity-provider: A custom SimilarityProvider example for Elasticsearch

Not sure how your code looks like though, maybe you can share it with the
community?

Jörg

Thanks for the awesome example. Another question from me: if I wanted to take the search example and build another one on top of it (for example, to do paginated search), could I access the search response inside the doExecute method?
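
One way to think about this question, sketched with stand-in types only: when a search runs asynchronously, the response arrives in a callback, so an action that wants to look at it can wrap the caller's listener. Everything below is hypothetical (`Listener`, `Page`, `PagedAction` are not Elasticsearch classes, and the in-memory `search` merely stands in for a real search call); it only shows the wrapping pattern and the usual page-to-offset arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for an asynchronous callback; not the Elasticsearch listener type. */
interface Listener<T> {
    void onResponse(T response);
    void onFailure(Throwable t);
}

/** Hypothetical response: one page of hit ids plus the total hit count. */
final class Page {
    final List<String> hits;
    final long total;

    Page(List<String> hits, long total) {
        this.hits = hits;
        this.total = total;
    }
}

final class PagedAction {
    /** Offset of the first hit on a zero-based page. */
    static int fromOffset(int page, int size) {
        return page * size;
    }

    /** True if hits remain after the given zero-based page. */
    static boolean hasNext(Page p, int page, int size) {
        return (long) (page + 1) * size < p.total;
    }

    /** Hypothetical backend: serves a slice of an in-memory list via the callback. */
    static void search(List<String> all, int from, int size, Listener<Page> listener) {
        int start = Math.min(from, all.size());
        int end = Math.min(start + size, all.size());
        listener.onResponse(new Page(new ArrayList<>(all.subList(start, end)), all.size()));
    }

    /** Runs one page; the wrapped listener sees the response before the caller does. */
    static void doExecute(List<String> all, int page, int size, final Listener<Page> caller) {
        search(all, fromOffset(page, size), size, new Listener<Page>() {
            @Override
            public void onResponse(Page response) {
                // The response is available here, e.g. to decide whether to fetch more.
                caller.onResponse(response);
            }

            @Override
            public void onFailure(Throwable t) {
                caller.onFailure(t);
            }
        });
    }
}
```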

Jörg, thanks for the plugin to help as a starting point for plugin
development.

Although I have built a few plugins over the years, they were river or
analysis plugins, which are fairly easy. Writing a custom action required a
lot more digging, especially since there are very few to learn from. I
still would like to see a write-up regarding the different families of
transport actions: BroadcastOperationRequest,
MasterNodeOperationRequest, NodesOperationRequest,
SingleShardOperationRequest, SingleCustomOperationRequest,
etc. What is the difference? I understand it now, but it should be
documented. There is little documentation about the internals and there are
no code level comments. I always meant to experiment with the different
action hierarchies via simple plugins and document my findings. Perhaps one
day...

Cheers,

Ivan

On Wed, Jun 4, 2014 at 1:09 AM, joergprante@gmail.com <joergprante@gmail.com>
wrote:

Sorry, the plugin is outdated, a better start is by looking at

Elasticsearch Platform — Find real-time answers at scale | Elastic

Jörg

Yeah, but I would think the non-data node is already doing that job:

"These "non data" nodes are still part of the cluster, and they redirect operations exactly to the node that holds the relevant data. The other benefit is the fact that for scatter / gather based operations (such as search), these nodes will take part of the processing since they will start the scatter process, and perform the actual gather processing."

I just uploaded my native script code to https://github.com/virgil0/TestPlugin. It works with the function score query. You can see that there are 3 bin files I need to load into memory. Thank you for your reply.

Absolutely, agreed.

The docs are sparse in my simple plugin too. I will try to find some time to
add sample code for all the variants and explain the differences.

Jörg

You should have released this before my talk last week, I could have
mentioned it :\

https://www.youtube.com/watch?v=FbAO2k57bdg

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

Don't forget your slides. :)

http://code972.com/blog/2014/05/72-the-ultimate-guide-for-elasticsearch-plugins-video-slides

On Wed, Jun 4, 2014 at 2:30 PM, Itamar Syn-Hershko itamar@code972.com
wrote:

You should have released this before my talk last week, I could have
mentioned it :\

https://www.youtube.com/watch?v=FbAO2k57bdg

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

As said, it is true that scoring scripts (like the function score scripts or
the AbstractSearchScript) need to reside on data nodes. Accessing fields is
a low-level operation in a script, so it is not possible to install such a
boost plugin that uses scripting on a data-less node. You would have to
install it on all the data nodes, which might become tedious (but it is
doable).

Another issue is that you use scripting in a Java plugin. I conclude from
this that the search should later work over the HTTP API by executing a
standard function score query with the boost script name (is that true?)

When writing a plugin in a pure Java environment, you have many more degrees
of freedom to supersede the script functionality and use other code paths.
For example, you could reuse the resource watch service from ES (used for
watching script file changes) to reload the boost info (which is in your
binary files, I assume). Then you could build the query internally using the
Java API as a custom score query action and execute it from your favorite
(data-less) node (or from two nodes, for better fault tolerance / load
balancing).

Optionally, you could expose a new endpoint in the ES REST API, for example
"_search_with_boost", which works like "_search" but makes use of the
boost info files.

For a more generic solution, it would be convenient to convert the boost
info into a JSON parameter file, so that it could be loaded by the standard ES
settings/config routines and by other languages, and also be reused more
easily by others in the ES community :) An example plugin name could be
"boost control plugin"...

Jörg
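
The reload idea mentioned above (refreshing the boost info when its files change) can be sketched in plain Java without any Elasticsearch classes. This is a minimal illustration under assumptions: `ReloadableBoosts` is a hypothetical name, the `Supplier` stands in for whatever reads the boost files, and `reload()` is assumed to be called by some file-watching hook. The table is swapped atomically, so a lookup sees either the old or the new table, never a half-loaded one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical boost table that can be replaced while lookups are running. */
final class ReloadableBoosts {
    private final Supplier<Map<String, Float>> source; // stands in for reading the files
    private final AtomicReference<Map<String, Float>> current;

    ReloadableBoosts(Supplier<Map<String, Float>> source) {
        this.source = source;
        this.current = new AtomicReference<>(snapshot(source.get()));
    }

    /** Assumed to be invoked by a watcher when the backing files change. */
    void reload() {
        current.set(snapshot(source.get()));
    }

    /** Returns the boost for an id from the current table; 1.0 if absent. */
    float boostFor(String id) {
        return current.get().getOrDefault(id, 1.0f);
    }

    /** Copies the data so later changes to the source do not leak into lookups. */
    private static Map<String, Float> snapshot(Map<String, Float> data) {
        return Collections.unmodifiableMap(new HashMap<>(data));
    }
}
```

Taking a defensive copy on every reload costs memory briefly (old and new table coexist), but keeps lookups lock-free.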

Great walkthrough :) ... I only miss a mention of the standard
language plugins for scripting (groovy, js, etc.). And rivers are not
obsolete; the "pull" method from a singleton "river" node is just
discouraged.

Jörg

On Wed, Jun 4, 2014 at 11:30 PM, Itamar Syn-Hershko itamar@code972.com
wrote:

You should have released this before my talk last week, I could have
mentioned it :\

https://www.youtube.com/watch?v=FbAO2k57bdg

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/

On Tue, Jun 3, 2014 at 6:15 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

The plugin code comes with a junit test. It is available at

GitHub - jprante/elasticsearch-simple-action-plugin: A simple action plugin for Elasticsearch

In the hope it is useful,

Jörg

Really good suggestion! Yeah, the search will work by executing a standard
function score query with the boost script name. Thank you!

2014-06-04 15:06 GMT-07:00 joergprante@gmail.com [via Elasticsearch Users] <
ml-node+s115913n4057071h55@n3.nabble.com>:

As said, it is true that scoring scripts (like the function score scripts
or the AbstractSearchScript) need to reside on data nodes. Accessing fields
is a low-level operation in a script, so a boost plugin that uses scripting
will not work when installed on a data-less node. You would have to install
it on all the data nodes, which might become tedious (but it is doable).

Another issue is that you use scripting in a Java plugin. I conclude from
this that the search should later work over the HTTP API by executing a
standard function score query with the boost script name (is that true?).

When writing a plugin in a pure Java environment, you have many more degrees
of freedom to supersede the script functionality and use other code paths.
For example, you could reuse the resource watch service from ES (used for
watching script file changes) to reload the boost info (which is in your
binary files, I assume). Then you could build the query internally using the
Java API as a custom score query action and execute it from your favorite
(data-less) node (or from two nodes, for better fault tolerance / load
balancing).
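
The reload idea can be sketched in plain Java. This is only an illustration
under assumptions: it does not use the ES resource watch service, the class
name ReloadingBoostInfo is made up, and it assumes the boost info is a text
file with one "key=boost" entry per line rather than the binary files
mentioned above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder that re-reads a boost file when its modification time changes. */
final class ReloadingBoostInfo {
    private final Path file;
    private FileTime lastSeen;                        // null until the first load
    private Map<String, Float> boosts = new HashMap<>();

    ReloadingBoostInfo(Path file) {
        this.file = file;
    }

    /** Reloads the file if its modification time differs from the last load; returns true on reload. */
    synchronized boolean refreshIfChanged() {
        try {
            FileTime modified = Files.getLastModifiedTime(file);
            if (modified.equals(lastSeen)) {
                return false;                         // nothing changed, keep the current map
            }
            Map<String, Float> fresh = new HashMap<>();
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                int eq = line.indexOf('=');
                if (eq <= 0) {
                    continue;                         // skip blank lines and lines without a key
                }
                // Assumed format: key=boost; a non-numeric boost raises NumberFormatException.
                fresh.put(line.substring(0, eq).trim(),
                          Float.parseFloat(line.substring(eq + 1).trim()));
            }
            boosts = fresh;
            lastSeen = modified;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the boost for a key, or the given default if the key is unknown. */
    synchronized float boostFor(String key, float defaultBoost) {
        Float boost = boosts.get(key);
        return boost == null ? defaultBoost : boost;
    }
}
```

A caller would invoke refreshIfChanged() periodically (or from whatever file
watcher is available) and use boostFor() while building the query.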

Optionally, you could expose a new endpoint to the ES REST API, for
example "_search_with_boost", which works like "_search", but makes use of
the boost info files.

For a more generic solution, it would be convenient to convert the boost
info into a JSON parameter file, so that it could be loaded by the standard
ES settings/config routines and by other languages, and also be reused more
easily by others in the ES community :) An example plugin name could be
"boost control plugin"...

Jörg

--

Fei (Virgil) Xie

M.S. in Very Large Information System, Institute for Software Research

School of Computer Science, Carnegie Mellon University

fxie@andrew.cmu.edu

Alumni, Tsinghua University

One more hint: have a look at

org.elasticsearch.common.lucene.search.function.FieldValueFunction

This implements the ScoreFunction and fetches boost values from a
configured field in the doc, for use by the Java API for FunctionScoreQuery.

If you can write a custom ScoreFunction, you could implement another
ScoreFunction that obtains the boost value depending on other fields in the
doc. With a map holding the product price info, you could pass customer
names or product codes. In the custom action, at each query execution, the
map would have to be passed from the executing data-less node to the nodes
holding the shards. Caution: the map should not be too big (a few hundred
entries maybe). Otherwise this could become an overhead.
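
To make the map idea concrete, here is a minimal plain-Java sketch. It is
not an implementation of the ES ScoreFunction class; the class name
MapBoostFunction, the score(key, subQueryScore) signature, the
multiplication of boost and sub-query score, and the maxEntries limit are
all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical map-based boost lookup; mirrors the idea, not the ES ScoreFunction API. */
final class MapBoostFunction {
    private final Map<String, Float> boostByKey;
    private final float defaultBoost;

    /**
     * @param boostByKey   boost per key (e.g. product code or customer name)
     * @param defaultBoost boost used when a key is not in the map
     * @param maxEntries   assumed upper bound, since the map travels with every query
     */
    MapBoostFunction(Map<String, Float> boostByKey, float defaultBoost, int maxEntries) {
        if (boostByKey.size() > maxEntries) {
            throw new IllegalArgumentException(
                "boost map has " + boostByKey.size() + " entries, limit is " + maxEntries);
        }
        // Defensive copy so later changes to the caller's map do not affect scoring.
        this.boostByKey = Collections.unmodifiableMap(new HashMap<>(boostByKey));
        this.defaultBoost = defaultBoost;
    }

    /** Multiplies the sub-query score by the boost found for the key taken from the doc. */
    float score(String key, float subQueryScore) {
        Float boost = boostByKey.get(key);
        return subQueryScore * (boost == null ? defaultBoost : boost);
    }
}
```

In a real plugin, the key would come from a field of the current document,
and the combination with the sub-query score would follow whatever the
function score query is configured to do.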

The alternative would be simpler: a script, with Groovy JsonSlurper code for
example, executing the boost function score search, of course on all the
data nodes. This may be preferable since the search is much faster when
the product price boost info is set up once from the local file system of
the data node (or once a day).
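
The "set up once (or once a day)" part could look roughly like the
following. This is a plain-Java sketch, not the Groovy script itself; the
class name DailyBoostCache, the loader and the injectable clock are
assumptions for illustration, and how the loader reads the local file is
left open.

```java
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical cache that loads the boost info once and refreshes it after a maximum age. */
final class DailyBoostCache {
    private final Supplier<Map<String, Float>> loader;  // e.g. reads the local boost file
    private final LongSupplier clockMillis;             // injectable clock, eases testing
    private final long maxAgeMillis;
    private Map<String, Float> cached;                  // null until the first load
    private long loadedAt;

    DailyBoostCache(Supplier<Map<String, Float>> loader,
                    LongSupplier clockMillis,
                    long maxAgeMillis) {
        this.loader = loader;
        this.clockMillis = clockMillis;
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Returns the cached boost info, invoking the loader only on first use or after expiry. */
    synchronized Map<String, Float> get() {
        long now = clockMillis.getAsLong();
        if (cached == null || now - loadedAt >= maxAgeMillis) {
            cached = loader.get();
            loadedAt = now;
        }
        return cached;
    }
}
```

With System::currentTimeMillis as the clock and 24 hours as the maximum age,
every search on a data node would reuse the same in-memory map for a day.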

Jörg

Good idea! One thing I am not quite clear about: to write a custom ScoreFunction, will I have to modify the Elasticsearch source code and compile it? Or is there another way to do it? Thank you.