I had a look into this using the validate query API.
Here's the command to debug aa OR bb AND cc:
GET githubcommits/_validate/query?q=aa+OR+bb+AND+cc&rewrite=true&df=myfield
The result is:
"explanation" : "myfield:aa +myfield:bb +myfield:cc"
Lucene's Boolean query has the idea of mandatory must clauses and should clauses, which are just nice-to-haves. In the explanation above, the + prefix marks a must clause. So aa is relegated to a wholly optional should clause that only gives extra scoring points to documents that contain both of the mandatory must clauses, bb and cc.
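To make the difference concrete, here is a minimal Python sketch of the matching rule as I understand it: if a Boolean query has any must clauses, only those decide whether a document matches; if it has none, at least one should clause has to match. The helpers (term, bool_query) and the sample documents are made up for illustration, and scoring is ignored entirely.

```python
# Toy model of Lucene's Boolean matching rules. Not Lucene code: the helper
# names and the sample documents are hypothetical, and scoring is left out.

def term(token):
    """A clause that matches when the document contains the token."""
    return lambda doc: token in doc

def bool_query(must=(), should=()):
    """With must clauses, should clauses are optional for matching.
    Without must clauses, at least one should clause has to match."""
    def matches(doc):
        if must:
            return all(clause(doc) for clause in must)
        return any(clause(doc) for clause in should)
    return matches

# What "aa OR bb AND cc" was rewritten to: myfield:aa +myfield:bb +myfield:cc
parsed = bool_query(must=[term("bb"), term("cc")], should=[term("aa")])

# What the author of the query probably meant: aa OR (bb AND cc)
intended = bool_query(
    should=[term("aa"), bool_query(must=[term("bb"), term("cc")])]
)

docs = {
    "only_aa": {"aa"},
    "aa_bb": {"aa", "bb"},
    "bb_cc": {"bb", "cc"},
    "all": {"aa", "bb", "cc"},
}

parsed_hits = sorted(name for name, doc in docs.items() if parsed(doc))
intended_hits = sorted(name for name, doc in docs.items() if intended(doc))

print(parsed_hits)    # ['all', 'bb_cc']
print(intended_hits)  # ['aa_bb', 'all', 'bb_cc', 'only_aa']
```

A document containing only aa is missed by the rewritten query, which is the surprising part.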
If you want to have pure OR clauses in Lucene you need to use a Boolean query with should clauses but no must clauses. Something like this:
bool
  should
    aa
    bool
      must
        bb
        cc
Note the use of a nested bool query above to get the required logic.
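As a rough sketch, the nested structure could be written out as a search request body like the one below, built here as a Python dict and printed as JSON. I'm assuming term queries against myfield (the df used in the validate call); depending on your mapping and analyzer a match query may be the better fit. term_clause is just a helper name for this example.

```python
import json

def term_clause(field, value):
    """Hypothetical helper: a single term query clause."""
    return {"term": {field: value}}

# aa OR (bb AND cc): the outer bool has only should clauses,
# and the AND lives in a nested bool with must clauses.
query = {
    "query": {
        "bool": {
            "should": [
                term_clause("myfield", "aa"),
                {
                    "bool": {
                        "must": [
                            term_clause("myfield", "bb"),
                            term_clause("myfield", "cc"),
                        ]
                    }
                },
            ]
        }
    }
}

# Body for a search request, e.g. GET githubcommits/_search
print(json.dumps(query, indent=2))
```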
Introducing brackets in query_string syntax, for example aa OR (bb AND cc), forces the creation of these nested Boolean clauses and makes the logic behave in a more predictable way.
Weird, but I'd hesitate to call it a bug - more a quirk of Lucene.
For readability's sake alone I would advocate using brackets to make the logic clear.
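If you want to check how a given query string gets rewritten, the same validate request can be built for both the bracketed and unbracketed forms. This sketch only assembles the request path; build_validate_path is a made-up helper, and sending the request to your cluster is left to whatever client you use.

```python
from urllib.parse import urlencode

def build_validate_path(index, query_string, default_field):
    """Hypothetical helper: builds the validate query path with rewrite=true."""
    params = {"q": query_string, "rewrite": "true", "df": default_field}
    # urlencode escapes spaces as "+" and percent-encodes the brackets
    return f"{index}/_validate/query?{urlencode(params)}"

unbracketed = build_validate_path("githubcommits", "aa OR bb AND cc", "myfield")
bracketed = build_validate_path("githubcommits", "aa OR (bb AND cc)", "myfield")

print("GET " + unbracketed)
print("GET " + bracketed)
```

Comparing the explanation returned for each request shows whether the brackets produced the nested clauses you expected.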