Shard-Aware Routing of Queries

Hi,

I have a large-ish data set that could grow beyond 100M documents. I have
queries to execute against this index. I would like the filter data that is
local to a shard to be sent only to that shard, so that I spend less time
building the filter and even less time matching it on each shard. If I do not
do this, I will have to build a filter that contains data for all 100M
documents across all shards, and every shard will have to match its documents
against filter entries for documents that do not even belong to that shard.

I plan to write a query filter using the IndexQueryParserModule plugin.

However, in the QueryParserContext, I can only see the Index object, which
contains some details of the index, such as its name. I could not see any
other details, such as the specific shard where the query will be executed.

Is there a way to write shard-aware query and filter parsers?

If not, can I create as many indices as I would otherwise create shards (since
I already get the index name), effectively one shard per index (plus one
replica), and treat every index as if it were a shard? Is that too heavy, or
just contrary to the philosophy of ES?

Please let me know,

Thanks,
Sandeep

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e8c09c18-4192-41ae-86e9-5d67723e5558%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

You can create a single-shard index, or you can use routing to select shards.

See SearchRequestBuilder for setRouting()

Jörg
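
For illustration, here is a minimal sketch of what routing does conceptually.
Elasticsearch maps a routing value to a shard by hashing it and taking the
result modulo the number of primary shards. The hash used below
(String.hashCode) is only a stand-in chosen for this sketch; the real hash
function is internal to Elasticsearch and differs between versions, so the
shard numbers printed here will not match a real cluster. The class name, the
routing values and the shard count are likewise made up. On the client side,
the corresponding call would be along the lines of
client.prepareSearch("myindex").setRouting("user-1"), where the index name and
routing value are again hypothetical.

```java
public class ShardRoutingSketch {

    // ASSUMPTION: stand-in hash. Elasticsearch uses its own internal hash
    // function, so real shard assignments will differ from these.
    public static int hash(String routing) {
        return routing.hashCode();
    }

    // Conceptual rule: shard = hash(routing) mod number of primary shards.
    // floorMod keeps the result in [0, shards) even for negative hashes.
    public static int shardFor(String routing, int numberOfPrimaryShards) {
        return Math.floorMod(hash(routing), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        int shards = 5; // hypothetical number of primary shards
        for (String routing : new String[] {"user-1", "user-2", "user-3"}) {
            System.out.println(routing + " -> shard " + shardFor(routing, shards));
        }
    }
}
```

The point of the sketch is only that the same routing value always lands on
the same shard, which is what lets a search with setRouting() touch one shard
instead of all of them.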

On Tue, Jul 15, 2014 at 10:25 AM, 'Sandeep Ramesh Khanzode' via
elasticsearch <elasticsearch@googlegroups.com> wrote:

Is there a way to write shard-aware query and filter parsers?

Hi,

Thanks, I will take a look at the SearchRequestBuilder class.
However, that seems to make routing a decision the user takes when invoking
the query API, by setting the appropriate values on the SRB.

However, I want the custom FilterParser that I added as a processor in the
IndexQueryParserModule plugin to be aware of the shard on which it will
execute. This is because then I can set filter values for only the
documents that exist on that shard. I checked the QueryParserContext, and
there is no information in that regard.

If I use the SRB on the client side and specify the shards and the filters
for those shards, then I will have to aggregate the results myself, which is
not preferable.

Can you please give me an example of how this can be achieved?

Thanks,
Sandeep
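
As a sketch of the client-side alternative described above, the filter values
could be grouped by the shard they would route to, so that each group is only
ever sent along with a search routed to that shard. This assumes that every
filter value is keyed by the same value that was used for routing at index
time, and it again uses String.hashCode as a stand-in for the real
Elasticsearch hash. The class and method names are invented for this example;
issuing the per-shard searches and merging their results is left out, which is
exactly the aggregation work mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FilterPartitionSketch {

    // ASSUMPTION: stand-in for the internal Elasticsearch routing hash.
    public static int shardFor(String routing, int shards) {
        return Math.floorMod(routing.hashCode(), shards);
    }

    // Group filter keys by the shard their routing value maps to.
    public static Map<Integer, List<String>> partition(List<String> filterKeys, int shards) {
        Map<Integer, List<String>> byShard = new TreeMap<>();
        for (String key : filterKeys) {
            byShard.computeIfAbsent(shardFor(key, shards), s -> new ArrayList<>()).add(key);
        }
        return byShard;
    }

    public static void main(String[] args) {
        // Hypothetical filter keys and shard count.
        List<String> keys = Arrays.asList("doc-1", "doc-2", "doc-3", "doc-4");
        Map<Integer, List<String>> parts = partition(keys, 3);
        for (Map.Entry<Integer, List<String>> e : parts.entrySet()) {
            System.out.println("shard " + e.getKey() + " gets filter keys " + e.getValue());
        }
    }
}
```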

On Tuesday, 15 July 2014 15:18:47 UTC+5:30, Jörg Prante wrote:

You can create a single-shard index, or you can use routing to select shards.

See SearchRequestBuilder for setRouting()

Filters are always parsed as part of a query at the shard level. If you
examine QueryParserContext from within an executing FilterParser, the decision
of which shard to execute on has already been made.

Jörg

On Tue, Jul 15, 2014 at 1:09 PM, 'Sandeep Ramesh Khanzode' via
elasticsearch <elasticsearch@googlegroups.com> wrote:

However, I want the custom FilterParser that I added as a processor in the
IndexQueryParserModule plugin to be aware of the shard on which it will
execute. This is because then I can set filter values for only the
documents that exist on that shard. I checked the QueryParserContext, and
there is no information in that regard.

Hi Jörg,

I have been examining the QueryParserContext. However, I can only find the
index name in this object; there is no reference to any shard-level
information.

I understand you to be saying that the shard decision has already been made
(so there is possibly no need to restate it here), and that the information is
therefore not available to the QueryParser, probably by design?

Thanks,
Sandeep
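
Since only the index name is visible to the parser, the one-index-per-partition
idea from the first message could be sketched as a lookup of filter values
keyed by index name. This is plain Java and not Elasticsearch plugin code: in
a real FilterParser the name would come from the Index object mentioned above,
and how the per-index values get loaded is left open. The class name, method
names and index names ("data-0", "data-1") are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PerIndexFilterSketch {

    // Filter values grouped by the name of the index they apply to.
    private final Map<String, Set<String>> valuesByIndex = new HashMap<>();

    // Register filter values for one index (one partition of the data).
    public void register(String indexName, String... values) {
        valuesByIndex.computeIfAbsent(indexName, n -> new HashSet<>())
                     .addAll(Arrays.asList(values));
    }

    // Return only the values relevant to the given index; empty if unknown.
    public Set<String> valuesFor(String indexName) {
        return Collections.unmodifiableSet(
                valuesByIndex.getOrDefault(indexName, Collections.emptySet()));
    }

    public static void main(String[] args) {
        PerIndexFilterSketch filters = new PerIndexFilterSketch();
        filters.register("data-0", "a", "b");
        filters.register("data-1", "c");
        // A parser running against "data-0" would only see that index's values.
        System.out.println(filters.valuesFor("data-0"));
    }
}
```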

On Tuesday, 15 July 2014 17:01:56 UTC+5:30, Jörg Prante wrote:

Filters are always parsed as part of a query at the shard level. If you
examine QueryParserContext from within an executing FilterParser, the decision
of which shard to execute on has already been made.
