Range filter in should match query

Hi, let's say I have 3 documents,

id=1, "text" = "my name is john", "date" = "2021"
id=2, "text" = "my name is alice", "date" = "2017"
id=3, "text" = "my name is steven", "date" = "2019"

I want to find all documents that contain at least one of several exact phrases (e.g. all documents that contain john OR alice). I used a match_phrase query, and it works fine as long as I don't add a range filter on the date field. It does work if I apply the range as a post_filter, but that isn't really recommended because of performance (aggregations). Is there a "smarter" way of doing this, i.e. return all documents that contain one phrase or the other and fall within the given date range?

Query

{
  "from": 0,
  "size": 10000,
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "text": "john"
          }
        },
        {
          "match_phrase": {
            "text": "steven"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "date": {
        "order": "desc"
      }
    }
  ],
  "post_filter": {
    "range": {
      "date": {
        "gte": "2021",
        "lte": "2022"
      }
    }
  }
} 

This returns only John: the query block finds both John and Steven, and the post_filter then keeps only John based on the date, which is what I want. I tried putting the range in bool must and in should instead, but then both documents come back. Any ideas, or is post_filter my only option here?

Thanks

Hi @ansamHox

You can put the date range in "filter" and the matches in "should". With "filter" you also get the benefit of caching. Like this:

POST test/_bulk
{"index":{}}
{"text":"my name is john","date":2021}
{"index":{}}
{"text":"my name is alice","date":2017}
{"index":{}}
{"text":"my name is steven","date":2019}

GET test/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "date": {
              "gte": 2021,
              "lte": 2022
            }
          }
        }
      ],
      "should": [
        {
          "match": {
            "text": "john"
          }
        },
        {
          "match": {
            "text": "alice"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "date": {
        "order": "desc"
      }
    }
  ]
}

In my case this doesn't work: with the filter it returns over 10k results, but without it only 700. I even set the filter range from 2000 to 2022, which covers basically all documents. With post_filter it works and also returns 700 results, which is why I was wondering whether there is another approach. When I search for each string separately, one matches around 300 documents and the other around 500, and some documents contain both, so 700 hits makes sense.

I didn't mention that the values are encrypted by my custom analyzer. That's why I use match_phrase: one "real" letter is encoded as 16 characters, etc.

Here's the solution, in case anyone has the same issue:

Combining should and filter is problematic without minimum_should_match. From the Boolean query page of the Elasticsearch Guide [8.3]: "If the bool query includes at least one should clause and no must or filter clauses, the default value is 1. Otherwise, the default value is 0."

If I add "minimum_should_match": 1 to the bool query, it works as expected.
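
For anyone who wants to see it in context, here is a minimal sketch of the request body for GET test/_search with that setting added. It reuses the index name (test), field names, and sample values from the posts above, so treat those as placeholders rather than my real mapping:

```json
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "date": {
              "gte": "2021",
              "lte": "2022"
            }
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "text": "john"
          }
        },
        {
          "match_phrase": {
            "text": "steven"
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "sort": [
    {
      "date": {
        "order": "desc"
      }
    }
  ]
}
```

Against the three sample documents this should return only the john document: steven matches a should clause but falls outside the date range, and alice matches neither. Without minimum_should_match, the should clauses only influence scoring once a filter is present, which is why every document in the date range came back.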
