Removing redundancy in query

Can this query:

{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "elasticsearch" } },
        { "match": { "author": "jondoe" } },
        { "range": { "price": { "gte": 10, "lte": 100 } } }
      ],
      "must_not": [
        { "match": { "author": "jondoe" } },
        { "match": { "cover": "black" } }
      ]
    }
  }
}

be optimized (removing redundancy) by rewriting it like this:

{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "elasticsearch" } },
        { "range": { "price": { "gte": 10, "lte": 100 } } },
        { "match": { "author": "jondoe" } }
      ],
      "must_not": [
        { "match": { "cover": "black" } }
      ]
    }
  }
}

1/ I want to understand whether this rewrite is possible.
2/ Would it help reduce processing time?

Hi @searchwithme,

The key difference between your two queries looks to be the inclusion of

{ "match": { "author": "jondoe" } }

in both the must and must_not clauses of the first query.

Have you compared the queries using the Search Profiler and checked the results returned by each query? Can you explain more about which documents you expect to be returned?
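If it helps, the same timing breakdown can be requested outside Kibana by setting "profile": true at the top level of the _search request body. Below is a minimal Python sketch that only builds such a body. The helper name with_profiling is made up for illustration, and actually sending the body to your cluster (with whichever client and index you use) is left out.

```python
import copy
import json


def with_profiling(search_body):
    """Return a copy of a search body with profiling switched on.

    `with_profiling` is a hypothetical helper name, not part of any client library.
    """
    profiled = copy.deepcopy(search_body)  # leave the caller's dict untouched
    profiled["profile"] = True             # top-level flag in the _search request body
    return profiled


# First query from the question: the author clause sits in both must and must_not.
original_query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "elasticsearch"}},
                {"match": {"author": "jondoe"}},
                {"range": {"price": {"gte": 10, "lte": 100}}},
            ],
            "must_not": [
                {"match": {"author": "jondoe"}},
                {"match": {"cover": "black"}},
            ],
        }
    }
}

# Print the body that would be sent to the _search endpoint.
print(json.dumps(with_profiling(original_query), indent=2))
```

Running both versions of the query this way and comparing the hits as well as the profile output should show whether they really return the same documents.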


In this particular case, with the conflicting clauses (jondoe must and must not exist), it would be impossible for an optimisation algorithm to know automatically which of the two clauses you want to keep.
Assisting with the change rather than automating it would be more useful here. One area I have experimented with is using custom aggregations and Sankey diagrams to show how the individual clauses in a given query match (or don’t). That would have red-flagged your must_not clause as rejecting 100% of the documents fed to it by all the other query clauses.
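As a much simpler stand-in for that kind of tooling (this is not the aggregation and Sankey approach itself), a small script can at least flag clauses that appear verbatim in both must and must_not and leave the decision to a human. A minimal sketch in Python, assuming the query is available as a dict; the function name find_conflicting_clauses is made up for illustration, and it only catches exact duplicates, not clauses that merely overlap in meaning.

```python
import json


def _as_list(clauses):
    """The bool query accepts a single clause object or a list; normalise to a list."""
    if clauses is None:
        return []
    return clauses if isinstance(clauses, list) else [clauses]


def find_conflicting_clauses(search_body):
    """Return the clauses listed identically under both must and must_not.

    Hypothetical helper: it reports conflicts but does not pick which side to keep.
    """
    bool_query = search_body.get("query", {}).get("bool", {})
    must = _as_list(bool_query.get("must"))
    must_not = _as_list(bool_query.get("must_not"))

    # Serialise with sorted keys so key order does not hide a duplicate.
    forbidden = {json.dumps(clause, sort_keys=True) for clause in must_not}
    return [c for c in must if json.dumps(c, sort_keys=True) in forbidden]


original_query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "elasticsearch"}},
                {"match": {"author": "jondoe"}},
                {"range": {"price": {"gte": 10, "lte": 100}}},
            ],
            "must_not": [
                {"match": {"author": "jondoe"}},
                {"match": {"cover": "black"}},
            ],
        }
    }
}

rewritten_query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "elasticsearch"}},
                {"range": {"price": {"gte": 10, "lte": 100}}},
                {"match": {"author": "jondoe"}},
            ],
            "must_not": [{"match": {"cover": "black"}}],
        }
    }
}

print(find_conflicting_clauses(original_query))   # the author clause is flagged
print(find_conflicting_clauses(rewritten_query))  # nothing flagged
```

A flagged clause means the original query cannot match any document, so the rewrite is a change in meaning rather than a removal of redundancy.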

