Write a plugin to query and aggregate results from multiple shards

Hi,

I am looking through the sources, and I am not sure whether this is
possible. What I am looking for is a way to manipulate the
SearchRequest object when it reaches the SearchShards level,
since I need to update the object with a value that is shard specific.

For this, I was checking TransportBroadcastOperationAction, which
allows hitting multiple shards and lets us inject a SearchService.
However, for the response aggregation we may have to write our own logic to
call SearchPhaseController::merge() or something similar. I am not sure if
this will be a problem when the same code in Elasticsearch changes over
releases.
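
To make the aggregation concern concrete, here is a minimal sketch of what a hand-written merge step amounts to: collect the hits returned by each shard and keep the global top N by score. It is a toy model in plain Java. ToyHit, ShardMergeSketch and mergeTopHits are hypothetical names invented for illustration, not Elasticsearch classes, and the sketch does not reproduce what SearchPhaseController actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: hypothetical types, not Elasticsearch classes.
final class ToyHit {
    final int shardId;
    final String docId;
    final float score;

    ToyHit(int shardId, String docId, float score) {
        this.shardId = shardId;
        this.docId = docId;
        this.score = score;
    }
}

final class ShardMergeSketch {
    // Highest score first; ties are broken by shard id so the order is deterministic.
    static final Comparator<ToyHit> BY_SCORE_DESC = new Comparator<ToyHit>() {
        @Override
        public int compare(ToyHit a, ToyHit b) {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : Integer.compare(a.shardId, b.shardId);
        }
    };

    // Flattens the per-shard hit lists and returns at most `size` best hits.
    static List<ToyHit> mergeTopHits(List<List<ToyHit>> perShardHits, int size) {
        List<ToyHit> all = new ArrayList<>();
        for (List<ToyHit> hits : perShardHits) {
            all.addAll(hits);
        }
        all.sort(BY_SCORE_DESC);
        return new ArrayList<>(all.subList(0, Math.min(size, all.size())));
    }

    public static void main(String[] args) {
        List<ToyHit> shard0 = Arrays.asList(new ToyHit(0, "a", 1.0f), new ToyHit(0, "b", 0.5f));
        List<ToyHit> shard1 = Arrays.asList(new ToyHit(1, "c", 0.9f));
        for (ToyHit hit : mergeTopHits(Arrays.asList(shard0, shard1), 2)) {
            System.out.println(hit.docId + " from shard " + hit.shardId);
        }
    }
}
```

Any custom action that bypasses the built-in search types would have to own logic of this kind itself, which is where the maintenance risk across releases comes from.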

There are also other classes like SearchServiceTransportAction, and we could
probably also extend TransportSearchTypeAction like the existing QAF, DFS_QAF,
QTF, DFS_QTF, etc. actions. However, what I want to know is whether this is
standard practice and should be done this way. Or is there another plugin
that allows me to do this?

Thanks,
Sandeep

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e1f52da2-bb05-4005-bf88-8031f5440225%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

See TransportSearchAction: in the doExecute() method, the SearchRequest
is dispatched to one of several transport actions, one per search type.

Assuming you write your own custom action: the shard-level request
is ShardSearchRequest. It is easier to add information to the SearchRequest,
pass it down, and extract the relevant parts from the SearchRequest later.
See the ShardSearchRequest constructor for how the parameters are repacked
and delegated to a shard.
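
The idea of attaching data to the top-level request and extracting the relevant part per shard can be sketched as follows. This is a toy model in plain Java under stated assumptions: ToySearchRequest, ToyShardRequest, fanOut and the "shard.<id>" key convention are all hypothetical and only mirror the shape of the approach, not the real SearchRequest or ShardSearchRequest classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical types, not Elasticsearch classes.
final class ToySearchRequest {
    final String query;
    final Map<String, String> extras; // caller-supplied data carried along with the request

    ToySearchRequest(String query, Map<String, String> extras) {
        this.query = query;
        this.extras = Collections.unmodifiableMap(new HashMap<>(extras));
    }
}

final class ToyShardRequest {
    final int shardId;
    final String query;
    final String shardValue; // the extra relevant to this shard, or null if none was supplied

    // Repacks the top-level request for one shard and extracts only that shard's value.
    ToyShardRequest(ToySearchRequest request, int shardId) {
        this.shardId = shardId;
        this.query = request.query;
        this.shardValue = request.extras.get("shard." + shardId);
    }
}

final class ShardFanOutSketch {
    // Creates one shard-level request per shard from a single top-level request.
    static List<ToyShardRequest> fanOut(ToySearchRequest request, int numberOfShards) {
        List<ToyShardRequest> perShard = new ArrayList<>();
        for (int shardId = 0; shardId < numberOfShards; shardId++) {
            perShard.add(new ToyShardRequest(request, shardId));
        }
        return perShard;
    }

    public static void main(String[] args) {
        Map<String, String> extras = new HashMap<>();
        extras.put("shard.0", "alpha");
        extras.put("shard.2", "gamma");
        for (ToyShardRequest r : fanOut(new ToySearchRequest("some query", extras), 3)) {
            System.out.println(r.shardId + " -> " + r.shardValue);
        }
    }
}
```

The point of the design is that the caller never needs to know which shard handles the request: all shard-specific values travel with the one request, and each shard picks out its own.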

Maybe it is possible to add the info to the extraSource of the SearchRequest.

Jörg

Appreciate the response as always.

Please bear with my technical understanding of ES :)

In TransportSearchAction, doExecute() delegates to one of the six
different search types. It is inside the execute methods of those six
individual actions that the shards are looked at. Correct me if I
am wrong. Even if I modify the SearchRequest at this point, I will not be
able to identify the shard ID that will operate on this request object
before making the modification.

Which class/module should I use to write my custom action? If I write a
seventh search action type extending TransportSearchTypeAction, then in the
executeQuery()/executeFetch() first/second phases I will probably get hold
of the ShardSearchRequest object. Are you saying that I should modify
it there? Please let me know. My concern is that if I write a new
custom search action plugin like this, I will not be able to replicate the
ES functionality of the existing six search types, and I will also be
referring to some internal classes of ES. Will that be okay?

Maybe I am missing something. But I need to modify the request object by
inserting the shard ID once it is known. The reason I need
to insert the shard ID is so that the Filter Parser Plugin can then take a
look at the SearchRequest (or ShardSearchRequest at that time) and do
something specific for that shard.

Thanks,
Sandeep

If you want to use the filter parser plugin (I think you mean
https://github.com/lmenezes/elasticsearch-terms-fetch-filter-plugin), then
why don't you simply extend the plugin and build a new plugin from that
codebase?

From what I understand, you somehow want to modify the search action core
code, but that is not the best way to extend Elasticsearch on the
query side. In a plugin you can add new queries and new query filters very
easily; they can be registered at plugin start time, without tampering with
the low-level core code.
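
As a rough illustration of the plugin-side approach, the sketch below registers a named filter parser once at startup and lets it produce a different filter depending on the shard it is invoked for. It is a toy model in plain Java, not the Elasticsearch plugin API: ToyFilterParser, ToyFilterRegistry, ShardAwareFilterParser, the "shard_aware" name and the "max_doc.<id>" key are all hypothetical, and it assumes the parser is handed the shard ID at parse time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Toy model only: hypothetical types, not the Elasticsearch plugin API.
interface ToyFilterParser {
    String name();

    // Builds a doc-id predicate from the filter body; shardId is the shard the parse runs on.
    IntPredicate parse(Map<String, String> body, int shardId);
}

final class ToyFilterRegistry {
    private final Map<String, ToyFilterParser> parsers = new HashMap<>();

    // Called once per parser when the (hypothetical) plugin starts.
    void register(ToyFilterParser parser) {
        parsers.put(parser.name(), parser);
    }

    IntPredicate parse(String filterName, Map<String, String> body, int shardId) {
        ToyFilterParser parser = parsers.get(filterName);
        if (parser == null) {
            throw new IllegalArgumentException("No filter registered for [" + filterName + "]");
        }
        return parser.parse(body, shardId);
    }
}

final class ShardAwareFilterParser implements ToyFilterParser {
    @Override
    public String name() {
        return "shard_aware";
    }

    @Override
    public IntPredicate parse(Map<String, String> body, int shardId) {
        // Look up the threshold for this shard only; without one, match every document.
        String raw = body.get("max_doc." + shardId);
        if (raw == null) {
            return docId -> true;
        }
        int maxDoc = Integer.parseInt(raw);
        return docId -> docId <= maxDoc;
    }
}

final class FilterPluginSketch {
    public static void main(String[] args) {
        ToyFilterRegistry registry = new ToyFilterRegistry();
        registry.register(new ShardAwareFilterParser());

        Map<String, String> body = new HashMap<>();
        body.put("max_doc.0", "10");
        System.out.println(registry.parse("shard_aware", body, 0).test(11));
        System.out.println(registry.parse("shard_aware", body, 1).test(11));
    }
}
```

With this shape the shard-specific behaviour lives entirely in the parser, so the search action code that dispatches the request does not need to be modified.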

Jörg
