Bool query search order

When using a should clause of a bool query - what is the order in which the
parts of the should clause are executed? Can I count on them being executed
in the order I provide? (That is - if the 'should' contains A or B or C,
will it be executed in that order?)

--

I would assume the order is undefined. Internally, boolean queries use
fairly complex logic to move through the various clauses in the most
efficient way, so the 'order' can vary.

On Wednesday, November 7, 2012 6:52:35 PM UTC+11, Rotem wrote:

When using a should clause of a bool query - what is the order in which
the parts of the should clause are executed? Can I count on them being
executed in the order I provide? (That is - if the 'should' contains A or B
or C, will it be executed in that order?)

--

On Wed, 2012-11-07 at 02:37 -0800, Chris Male wrote:

I would assume the order is undefined. Internally boolean queries use
some complex logic to move through the various clauses in the most
optimal way. The 'order' can vary.

That said (and this is not what you asked), the order is important for
and|or filters, which short-circuit.

clint

--

So, most restrictive filters first?

On Wednesday, November 7, 2012 11:40:33 AM UTC+1, Clinton Gormley wrote:

That said (and not what you asked) but the order is important for and|or
filters, which short circuit.

clint

--

But if I want to do it in a query, not a filter, is there a way to ensure
the order in which "OR" parts are performed?

--

On Wed, 2012-11-07 at 02:42 -0800, Tanguy wrote:

So, most restrictive filters at first?

For "and" yes. Least restrictive first for "or"

--

Thanks Clinton :)

On Wednesday, November 7, 2012 12:13:04 PM UTC+1, Clinton Gormley wrote:

For "and" yes. Least restrictive first for "or"

--

On 11/7/2012 3:12 AM, Clinton Gormley wrote:

On Wed, 2012-11-07 at 02:42 -0800, Tanguy wrote:

So, most restrictive filters at first?
For "and" yes. Least restrictive first for "or"

How would you (or the ES evaluation) define less/more restrictive? Are
you suggesting that ES picks which one to do first, or that I should put
what I think is the most restrictive as the first expression of an "and"
filter?

How would the evaluation order affect a scope defined in a child or
nested query? (Igor Motov and I were just discussing the issue of
evaluation order in another thread, but didn't know the answer, so I
couldn't work out why I was getting unexpected results from the scope of
a nested query.)

-Paul

--

Just for clarification: in the BooleanQuery conjunction case we pick the
least costly clause first. That means if you have N terms in the boolean
query, the one with the lowest document frequency will dominate the
intersection. In the near future Lucene might gain a "cost" API that makes
that easier to detect, e.g. if a clause is a complex query. I had a patch
for that but I don't recall the issue id at this point.

simon
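
A minimal sketch of the idea described above, assuming each clause is a
single term with a sorted list of matching doc IDs. It is not Lucene's
actual code (Lucene drives iterators that skip ahead rather than building
sets); the function name and the sample postings are made up for
illustration. The point is only that the clause with the lowest document
frequency supplies the candidates, whatever order the clauses were given in.

```python
from typing import List


def intersect_postings(postings: List[List[int]]) -> List[int]:
    """Intersect sorted doc-ID lists, driven by the shortest list."""
    if not postings:
        return []
    # Document frequency (list length) stands in for the clause "cost".
    ordered = sorted(postings, key=len)
    lead = ordered[0]
    others = [set(p) for p in ordered[1:]]
    # Only docs from the rarest clause are ever checked against the rest.
    return [doc for doc in lead if all(doc in other for other in others)]


# Hypothetical postings lists for three terms.
common = list(range(0, 1000))      # matches every doc
medium = list(range(0, 1000, 2))   # matches every second doc
rare = [10, 250, 251, 998]         # lowest document frequency

# The order in which the clauses are passed in does not change the result.
result_a = intersect_postings([common, medium, rare])
result_b = intersect_postings([rare, common, medium])
print(result_a)
```

In this toy version only the four candidates from the rare term are
tested, instead of walking all 1000 docs of the common term.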

--

On Wed, 2012-11-07 at 23:05 -0800, P.Hill wrote:

How would you (or the ES evaluation) define less/more restrictive? Are
you suggesting that ES picks which one to do first, or that I should put
what I think is the most restrictive as the first expression of an "and"
filter?

So if you have two sub-clauses:

  • status == 'active' (98% of docs)
  • tag == 'foo' (3% of docs)

The first clause should be:

  • and: tag == foo
    (you avoid running the status clause on 97% of docs)

  • or: status == active
    (98% of docs accepted based just on that clause)

clint
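
The effect of clause order on short-circuiting and|or filters can be
sketched with a small simulation. This is only an illustration under
stated assumptions: 100 in-memory docs with the 98% / 3% split from the
example above, plain Python predicates instead of real filters, and a
hypothetical helper `evaluations` that counts how many times any clause is
run.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, str]
Clause = Callable[[Doc], bool]

# 100 docs: 98 have status 'active', 3 have tag 'foo' (doc 2 has both).
DOCS: List[Doc] = [
    {"status": "active" if i >= 2 else "inactive",
     "tag": "foo" if i < 3 else "bar"}
    for i in range(100)
]


def is_active(doc: Doc) -> bool:
    return doc["status"] == "active"


def is_foo(doc: Doc) -> bool:
    return doc["tag"] == "foo"


def evaluations(combine, clause_order: List[Clause]) -> Tuple[int, int]:
    """Return (matching docs, clause evaluations) for all() or any()."""
    calls = {"n": 0}

    def counted(clause: Clause) -> Clause:
        def wrapped(doc: Doc) -> bool:
            calls["n"] += 1
            return clause(doc)
        return wrapped

    # all()/any() stop consuming the generator as soon as the outcome is
    # known, which is the short-circuit behaviour being modelled.
    matched = [doc for doc in DOCS
               if combine(counted(c)(doc) for c in clause_order)]
    return len(matched), calls["n"]


and_tag_first = evaluations(all, [is_foo, is_active])
and_status_first = evaluations(all, [is_active, is_foo])
or_status_first = evaluations(any, [is_active, is_foo])
or_tag_first = evaluations(any, [is_foo, is_active])

print("and, tag first:   ", and_tag_first)
print("and, status first:", and_status_first)
print("or, status first: ", or_status_first)
print("or, tag first:    ", or_tag_first)
```

The matched docs are identical for both orderings of each operator; only
the number of clause evaluations changes.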

--

So back to the original question: how does such logic apply in a bool
query, if at all?
Does the system choose?
How do "should", "must" and "must_not" clauses combine? In what order?

I ask because I have been struggling to get just the scope I need
out of a nested query, and I don't understand how to get something selected
in the parent object AND then select just the right
nested objects. It seems to depend on some unstated or misunderstood
rules. The example in the docs does not have anything else alongside the
nested query.

-Paul

--

Paul,

there are no misunderstood rules or things like that. The result will
always be the same no matter in what order you pass in the boolean clauses.
The only thing we do internally is apply optimizations to make query
evaluation faster.

simon
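
The claim that the result does not depend on clause order can be checked
on a toy model. The sketch below is an assumption-laden illustration, not
how Elasticsearch evaluates a bool query: docs are plain sets of terms,
every clause is a single term, and at least one "should" term is required
to match. The names `DOCS`, `matches` and `search` are hypothetical.

```python
from itertools import permutations

# Hypothetical index: doc ID -> set of terms in the doc.
DOCS = {
    1: {"a", "b"},
    2: {"b", "c"},
    3: {"a", "c", "x"},
    4: {"c"},
    5: {"a", "c"},
}


def matches(terms, must, should, must_not):
    """Toy bool semantics; one matching should term is required here."""
    return (all(t in terms for t in must)
            and not any(t in terms for t in must_not)
            and (not should or any(t in terms for t in should)))


def search(must, should, must_not):
    return {doc_id for doc_id, terms in DOCS.items()
            if matches(terms, must, should, must_not)}


# Run the same query with the should clauses in every possible order.
results = {
    frozenset(search(["c"], list(order), ["x"]))
    for order in permutations(["a", "b", "d"])
}
print(results)
```

All six orderings collapse to a single result set, because matching is
defined per document and not by the position of a clause in the list.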

On Friday, November 9, 2012 12:32:40 AM UTC+1, P Hill wrote:

So back to the original query: How does such logic apply in bool
query, if at all?
Does the system choose?
How do "should", "must" and "must_nots" combine? In what order?

--

So the order of the "should" clauses also has no effect? Are they all
executed, or does the first match win without the others being considered
(in case min_should_match is 1)?

On Fri, Nov 9, 2012 at 10:25 AM, simonw simon.willnauer@elasticsearch.com wrote:

Paul,

there are no misunderstood rules or things like that. The result will
always be the same no matter in what order you pass in the boolean clauses.
The only thing that we do internally is optimizations to make the query
evaluation faster.

simon

--

On Friday, November 9, 2012 9:48:52 AM UTC+1, Rotem wrote:

So the order of the "should" clauses also has no effect? Are they all
executed, or does the first match win without the others being considered
(in case min_should_match is 1)?

they are all advanced and are considered for scoring in that case.

simon
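
A rough sketch of what "all considered for scoring" means in practice,
assuming a toy scorer where each should term adds a fixed, made-up weight.
Real Lucene scoring is more involved; `WEIGHTS`, `DOCS` and `rank` are
hypothetical names. The thing to notice is that a doc matching several
should clauses outranks a doc matching only one, so the first match does
not "win" and stop the evaluation.

```python
# Hypothetical index and per-term weights (not real Lucene scores).
DOCS = {
    1: {"a"},
    2: {"a", "b"},
    3: {"a", "b", "c"},
    4: {"d"},
}
WEIGHTS = {"a": 1.0, "b": 2.0, "c": 4.0}
MIN_SHOULD_MATCH = 1


def rank(should):
    """Score every doc against every should term, then sort by score."""
    scored = []
    for doc_id, terms in DOCS.items():
        matched = [t for t in should if t in terms]
        if len(matched) >= MIN_SHOULD_MATCH:
            # Every matching clause contributes, not just the first one.
            scored.append((doc_id, sum(WEIGHTS[t] for t in matched)))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


forward = rank(["a", "b", "c"])
backward = rank(["c", "b", "a"])
print(forward)
print(forward == backward)
```

Reversing the clause list gives the same ranking, which matches the
earlier statement that the order of the boolean clauses does not change
the result.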

--

On 11/9/2012 5:53 AM, simonw wrote:

they are all advanced and are considered for scoring in that case.

simon

"Advanced"?

Maybe my question is not clear. Unlike the example on the nested
query page, I have a nested query embedded in a bool query on the parent
object. The order of processing for the parent objects is what I'm
wondering about.
When I set a scope in the nested query, what combination of parent and
nested object queries will have contributed to the resulting scope?
Maybe all of the parent queries run first, and only then, from that set of
parents, are the nested objects considered to come up with a result set?
I doubt that, because how would two nested queries, both in a must clause,
be evaluated in their entirety, and which nested objects would end up in
the scope?

-Paul

--

On Sunday, November 11, 2012 9:59:28 AM UTC+1, P Hill wrote:

Maybe my question is not clear. Unlike the example on the nested query
page, I have a nested query embedded in a bool query on the parent object.
The order of processing for the parent objects is what I'm wondering about.
[...]

hold on. this thread is about

"When using a should clause of a bool query - what is the order in which
the parts of the should clause are executed? Can I count on them being
executed in the order I provide? (That is - if the 'should' contains A or B
or C, will it be executed in that order?)"

how are you coming up with parent child here? It would be great if you
could ask your question in another thread. I, and potentially others, have
a hard time following when threads are hijacked. No offence here, really;
I'd love to help, but can you ask it in a dedicated thread?

simon

--

On 11/12/2012 12:45 AM, simonw wrote:

hold on. this thread is about

"When using a should clause of a bool query - what is the order in
which the parts of the should clause are executed? Can I count on them
being executed in the order I provide? (That is - if the 'should'
contains A or B or C, will it be executed in that order?)"

Why does the order matter?
Answer 1: Because the caller wants to declare the query so that the most
narrowing selection goes first to make things go faster, limiting the
documents under consideration as early as possible.
Answer 2: Because the caller actually wants to view and analyze part of
the results, a "scope" that went into the overall query, because they
think this might have a use, but without a definition of how things are
combined they are not sure. The obvious example of this is a child or
nested query that happens to be in a bool query.

"they are all advanced and are considered for scoring in that case."
is not an answer to either motivation for understanding the order of
execution.

OK, I'll stay on topic, which was "bool query search order".
Now that it has been stated that term queries in shoulds are analyzed to
pick which one is most restrictive, what of other queries in the should
list? It seems to me that answering the overall question about should
with a discussion of just picking between term queries is only a start
toward a complete answer. After all, there are a couple of dozen other
possibilities that might be in a should clause. Is there some obvious
ranking of such sub-queries?
For example, I probably wouldn't want to do a lot of prefix
queries without first restricting myself to as few docs as possible via
good term/terms queries. On first thought, I'd want phrases and spans
processed after terms. But then maybe someone else has a different case.
What of another bool query that is a "sub-expression" in the shoulds?
How would that enter into any choice and ordering?

And what of "must" and "must_not"? Should we expect them to be executed
before or after the shoulds?
What would I do if I wanted them in a different order? The only answer
that seems to be well defined is that filters come after queries, so
true ordering is currently only well defined by putting a query in the
filter of a filtered query.

-Paul

--