How to specify execution order of filter and query?


(Ivan Hall) #1

I am brand new to ElasticSearch and have a few fundamental questions
regarding queries and filters.

  1. How does one control ordering of filter vs query? Is it possible to
    filter first, then perform a query on those results?

With a filtered query (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html) the
documentation reads:

"A query that applies a filter to the results of another query."

How I interpret that statement is that the query will run first then the
filter will be applied. Can this behavior be adjusted?

Also, I can execute a search with an independent filter and query which
produces the same results as a filtered query. It appears the timing is
about the same as well.

  2. Can I simply place the filter first within the JSON object then the
    query to control whether the filter is run before the query?

Ultimately, I just really want to know the behavior for when filters are
run before queries and if there is a possibility to control the execution
precedence.

Thank you!

--
Coyote Logistics, LLC is a licensed Property Broker (MC# 561135-B).

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7c36c760-3b47-41a8-89c9-1da3a22ac153%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Ivan Brusic) #2

Hi Ivan (awesome name BTW),

Read my recent reply about filters:
https://groups.google.com/forum/?fromgroups#!topic/elasticsearch/xZGnyTI6lmo

Basically, the filter in a filtered query is executed before the query. IMHO
that documentation is misleading. To execute a filter after the query, use a
post filter (post_filter; it was simply the top-level 'filter' before 1.0).
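
A minimal sketch of the two request shapes being contrasted here, written as Python dictionaries in the 1.x-era query DSL this thread discusses. The field names (`title`, `status`) and their values are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical query and filter clauses; field names are illustrative only.
query_clause = {"match": {"title": "freight"}}
filter_clause = {"term": {"status": "active"}}


def filtered_query_body(query, flt):
    """Filter applied as part of the query itself (the 'filtered' query)."""
    return {"query": {"filtered": {"query": query, "filter": flt}}}


def post_filter_body(query, flt):
    """Filter applied to the hits after the query has run (1.0+ naming)."""
    return {"query": query, "post_filter": flt}


filtered_body = filtered_query_body(query_clause, filter_clause)
post_body = post_filter_body(query_clause, filter_clause)

print(json.dumps(filtered_body, indent=2))
print(json.dumps(post_body, indent=2))
```

Both bodies would go to the same search endpoint; only the placement of the filter differs.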

--
Ivan

On Tue, Feb 18, 2014 at 12:42 PM, Ivan Hall ivan.hall@coyote.com wrote:



(Ivan Brusic) #3

The documentation suddenly made me wonder whether what I knew was wrong. :)

The default strategy for Elasticsearch's filtered query is a custom random
access one. For each document, it will first check the docSet before
executing the query.
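
A toy model of that per-document check, assuming the filter has already been turned into a bit set. This is only a sketch of the idea, not the code in XFilteredQuery; the function names and the sample data are hypothetical.

```python
def random_access_filtered(query_docs, filter_bits, score):
    """Consult the filter's bit set for each candidate document
    before doing the (more expensive) scoring work for the query."""
    hits = []
    for doc_id in query_docs:
        if not filter_bits[doc_id]:  # cheap random-access check first
            continue
        hits.append((doc_id, score(doc_id)))
    return hits


# Hypothetical 8-document segment: the filter accepts docs 1, 3, 4 and 6.
filter_bits = [False, True, False, True, True, False, True, False]
query_docs = [0, 1, 4, 5, 6]  # docs the query would match on its own
scored = []


def score(doc_id):
    scored.append(doc_id)  # record which docs actually get scored
    return 1.0


hits = random_access_filtered(query_docs, filter_bits, score)
print(hits)    # documents that pass both the filter and the query
print(scored)  # only documents accepted by the filter were scored
```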

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/index/query/FilteredQueryParser.java#L64

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/common/lucene/search/XFilteredQuery.java#L188-L200

The strategy is configurable, so it would be nice if that were added to the
documentation. The original description of the filtered query comes
directly from Lucene:
http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/FilteredQuery.html

Cheers,

Ivan

On Tue, Feb 18, 2014 at 1:02 PM, Ivan Brusic ivan@brusic.com wrote:



(Ivan Hall) #4

I think I screwed up the reply... sorry if you got multiple replies Ivan.

What I said in those replies is:

From the GitHub links, it looks like a 'filter first' strategy has yet to
be implemented.

https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/common/lucene/search/XFilteredQuery.java#L183

I guess I'll stick with the default strategy for now, since I am inexperienced
and have not experimented with it yet. I assume it provides generally good
performance.

As I learn more I might try changing the strategy, and hopefully a filter
first strategy gets implemented. I would assume this would greatly
speed up string-based queries by filtering to a subset of documents first!

On Tuesday, February 18, 2014 3:19:50 PM UTC-6, Ivan Brusic wrote:



(Ivan Hall) #5

Also, I had another question regarding the difference between these two
queries:

{
  "query": { ... },
  "filter": { ... }
}

{
  "filtered": {
    "query": { ... },
    "filter": { ... }
  }
}

When I try these they both produce the same results. What is the technical
difference between these two?

On Wednesday, February 19, 2014 9:31:11 AM UTC-6, Ivan Hall wrote:



(Binh Ly) #6

Other than what Ivan Brusic already explained, there is also a difference
in how the facets/aggregations are computed. In your first example, the
aggregations are computed based on the results that match only the query
part. In your second example, the aggregations are computed based on the
results from both the query + filter.
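
The difference can be simulated in a few lines of plain Python. This is a toy model over hypothetical in-memory documents (the `carrier` and `status` fields are made up for illustration), not a call into Elasticsearch.

```python
from collections import Counter

# Hypothetical documents; 'carrier' and 'status' are illustrative fields.
docs = [
    {"id": 1, "text": "late freight", "status": "active", "carrier": "A"},
    {"id": 2, "text": "freight quote", "status": "closed", "carrier": "A"},
    {"id": 3, "text": "freight claim", "status": "active", "carrier": "B"},
    {"id": 4, "text": "invoice", "status": "active", "carrier": "B"},
]


def matches_query(doc):
    return "freight" in doc["text"]


def matches_filter(doc):
    return doc["status"] == "active"


def search_with_top_level_filter(docs):
    """First example: aggregations see every document the query matches."""
    query_hits = [d for d in docs if matches_query(d)]
    aggs = Counter(d["carrier"] for d in query_hits)
    hits = [d["id"] for d in query_hits if matches_filter(d)]
    return hits, dict(aggs)


def search_with_filtered_query(docs):
    """Second example: aggregations see only query + filter matches."""
    matched = [d for d in docs if matches_filter(d) and matches_query(d)]
    aggs = Counter(d["carrier"] for d in matched)
    return [d["id"] for d in matched], dict(aggs)


top_hits, top_aggs = search_with_top_level_filter(docs)
filtered_hits, filtered_aggs = search_with_filtered_query(docs)
print(top_hits, top_aggs)
print(filtered_hits, filtered_aggs)
```

In this sketch the returned hits are the same in both cases, while the per-carrier counts differ because the first variant counts the closed document as well.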



(Ivan Brusic) #7

If you trace the code further, you will see that it eventually uses
Lucene's LEAP_FROG_FILTER_FIRST_STRATEGY, whose documentation says:
"Note: This strategy uses the filter to lead the iteration." This strategy
is used when the default threshold (-1) is set and the DocIdSet (basically
bits) is "fast" (i.e. nothing expensive like geo).
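
To illustrate what "the filter leads the iteration" means, here is a toy leap-frog intersection over two sorted lists of document IDs. It is a sketch of the general idea only, not Lucene's implementation; the function names and sample IDs are hypothetical.

```python
from bisect import bisect_left


def advance(doc_ids, start, target):
    """Return the index of the first doc ID >= target, searching from start."""
    return bisect_left(doc_ids, target, start)


def leap_frog_filter_first(filter_docs, query_docs):
    """Intersect two sorted doc-ID lists, letting the filter lead."""
    hits = []
    f = q = 0
    while f < len(filter_docs):
        target = filter_docs[f]  # the filter proposes the next candidate
        q = advance(query_docs, q, target)
        if q == len(query_docs):
            break  # the query has no more documents
        if query_docs[q] == target:
            hits.append(target)  # both sides agree on this document
            f += 1
        else:
            # The query jumped past the candidate; the filter leaps to catch up.
            f = advance(filter_docs, f, query_docs[q])
    return hits


result = leap_frog_filter_first([2, 5, 9, 14, 20], [1, 2, 3, 9, 10, 20, 31])
print(result)
```

Because each side skips ahead to the other's current position, documents rejected by the filter are never examined by the query side.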

Binh answered your other question. Basically the outer filter is now called
post_filter to remove the ambiguity.

Cheers,

Ivan

On Wed, Feb 19, 2014 at 7:31 AM, Ivan Hall ivan.hall@coyote.com wrote:


