Lucene’s query parser (used to parse the “query_string” parts of JSON requests) works a little differently from regular Boolean logic.
The TL;DR is that you are advised to put brackets around almost everything for ANDs/ORs to make any sense.
To understand why, it helps to know the background story. Lucene is unlike databases that use binary matching logic, where records either match a query or don’t. Instead, Lucene is designed to match documents to searches to varying degrees. It can take a bag of search terms and relevance-rank documents based on:
- the number of search terms that matched
- how rare the matching terms are overall
- how frequently the terms are repeated in the document.
The more of these boxes that get ticked by a document the higher its relevance score.
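As a rough illustration of those three factors, here is a toy scoring sketch. It is not Lucene's actual scoring formula; the function name toy_score, the weighting choices and the tiny corpus are all made up for this example.

```python
import math

def toy_score(query_terms, doc_tokens, corpus):
    """Toy relevance score (an assumption for illustration, NOT Lucene's formula)."""
    score = 0.0
    n_docs = len(corpus)
    for term in set(query_terms):
        tf = doc_tokens.count(term)          # how often the term repeats in this doc
        if tf == 0:
            continue                         # unmatched terms contribute nothing
        df = sum(1 for d in corpus if term in d)   # how many docs contain the term
        idf = math.log(1 + n_docs / (1 + df))      # rarer terms weigh more
        score += idf * math.sqrt(tf)         # each matched term adds to the total
    return score

# Hypothetical three-document corpus, already tokenized and lowercased.
corpus = [
    ["elastic", "search", "elastic"],
    ["elasticsearch", "search"],
    ["search", "engine"],
]
query = ["elasticsearch", "elastic"]
scores = [toy_score(query, doc, corpus) for doc in corpus]
print(scores)
```

The third document matches none of the query terms, so it scores zero; the first scores highest because its matching term is repeated.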
This matching mode is well suited to free-text search, where the terms are loose options. However, sometimes users want to add mandatory clauses (products MUST be in my price range or MUST_NOT contain meat). As soon as a Boolean query has any of these mandatory clauses added, the optional parts become 100% optional: they are no longer needed for a match, but documents that contain them are still “preferable”.
If you don’t have any mandatory clauses in a Boolean query, then at least one of the listed optional clauses has to match (otherwise you’d match documents that were entirely irrelevant).
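The matching rule can be sketched in a few lines of Python. This is a simplified model over plain sets of terms, not Lucene code; the function name bool_matches and its parameters are made up for this example.

```python
def bool_matches(doc_terms, must=(), should=(), must_not=()):
    """Toy model of whether a flat Boolean query matches a document."""
    doc = set(doc_terms)
    if any(term in doc for term in must_not):
        return False                  # an excluded term is present
    if not all(term in doc for term in must):
        return False                  # a required term is missing
    if should and not must:
        # No MUST clause: at least one optional clause has to match.
        return any(term in doc for term in should)
    # With a MUST clause present, optional clauses only affect ranking.
    return True

# Only optional clauses: one of them must be present.
print(bool_matches(["elastic"], should=["elasticsearch", "elastic"]))
print(bool_matches(["lucene"], should=["elasticsearch", "elastic"]))
# A MUST clause makes the optional clause irrelevant for matching.
print(bool_matches(["elastic", "search"],
                   must=["elastic", "search"], should=["elasticsearch"]))
```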
Let’s build up with some examples:
elasticsearch
matches only documents that contain the word elasticsearch
elasticsearch OR elastic
matches documents that contain either of these words and preferably both
elasticsearch OR elastic AND search
is weird. Because we introduced a mandatory clause using AND, the effect can be surprising. Lucene’s parser sees this as two MUST clauses (elastic and search) and one entirely optional, extra-points-if-you-have-it clause (elasticsearch). In the more verbose JSON syntax, the parsed bool query is effectively:
bool
  must
    elastic
    search
  should
    elasticsearch
So a document containing only “Elasticsearch” would not match.
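Written out as an Elasticsearch request body, that parsed form might look roughly like the following. This is a sketch only: the field name "body" is a placeholder (query_string normally searches the index's default fields), and term queries stand in for whatever the parser actually generates for your mapping.

```python
import json

# Hypothetical field name, chosen only for this example.
FIELD = "body"

# Approximate equivalent of: elasticsearch OR elastic AND search
parsed = {
    "bool": {
        "must": [                      # both of these are required
            {"term": {FIELD: "elastic"}},
            {"term": {FIELD: "search"}},
        ],
        "should": [                    # optional, only boosts the score
            {"term": {FIELD: "elasticsearch"}},
        ],
    }
}

request_body = json.dumps({"query": parsed}, indent=2)
print(request_body)
```

In Lucene's operator notation the same query would be written as elasticsearch +elastic +search, where + marks a required term.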
The solution is to wrap the parts in brackets to make multiple Boolean expressions, e.g.
elasticsearch OR (elastic AND search)
This is parsed into:
bool
  should
    elasticsearch
    bool
      must
        elastic
        search
Note that the root bool has only two optional clauses and no mandatory ones, which means a document has to match at least one of them.
Horribly complex, I know, but the moral of the story is “use brackets” when mixing ANDs with ORs. This is good practice with other databases anyhow.
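To see the difference between the two parses side by side, here is a small recursive evaluator over toy query trees. It is a simplified model and not Lucene code: a string stands for a term, a dict stands for a bool node, and the name matches is made up for this example.

```python
def matches(query, doc_terms):
    """Evaluate a toy query tree: a str is a term, a dict is a bool node."""
    if isinstance(query, str):
        return query in doc_terms
    must = query.get("must", [])
    should = query.get("should", [])
    if not all(matches(q, doc_terms) for q in must):
        return False                   # a mandatory clause failed
    if should and not must:
        # Only optional clauses at this level: at least one has to match.
        return any(matches(q, doc_terms) for q in should)
    return True

# elasticsearch OR elastic AND search
unbracketed = {"must": ["elastic", "search"], "should": ["elasticsearch"]}
# elasticsearch OR (elastic AND search)
bracketed = {"should": ["elasticsearch", {"must": ["elastic", "search"]}]}

only_es = {"elasticsearch"}
both_words = {"elastic", "search"}

for name, tree in [("unbracketed", unbracketed), ("bracketed", bracketed)]:
    print(name, matches(tree, only_es), matches(tree, both_words))
```

A document containing only “elasticsearch” is rejected by the unbracketed form and accepted by the bracketed one, while a document with both “elastic” and “search” matches either way.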