Exact + partial match of text documents (with bool query?)

Hi friends! I am building a news search engine and need a way to construct ES queries that deliver relevant results. Currently, I am using a combination of must and should clauses in a bool query for every news topic a user searches. My query looks as follows (written with elasticsearch_dsl in Python):

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, MultiSearch, Q
...
self.search = self.search.query(
    "bool",
    must=[
        Q(
            "multi_match",
            query="Andrew Yang",
            fields=["title", "summary"],
            type="best_fields",
            tie_breaker=0.5
        )
    ],
    should=[
        Q(
            "multi_match",
            query="New Hampshire debate",
            fields=["title", "summary"],
            type="best_fields",
            tie_breaker=1.0
        ),
        Q(
            "multi_match",
            query="election",
            fields=["title", "summary"],
            type="best_fields",
            tie_breaker=1.0
        ),
        ...
    ],
    minimum_should_match=self.min_should_match
)

I initially expected this query to match the must query exactly (in this example, "Andrew Yang") and to match as many of the query words in should as possible. However, it appears that Elasticsearch is not doing a strict / exact match on the search terms in the must clause. This causes problems when two entirely different news topics share some keywords (for example, Andrew Yang and Prince Andrew): when a user searches for one of them, the other can appear as well.

My question: is there a way to do both an exact match on one or more keywords (let's call them group one keywords) and a partial match (like multi-match in the should clauses) on some other keywords (let's call them group two keywords) in a single query, so that the results returned definitely contain every group one keyword while matching as many group two keywords as possible? If so, what's the best way to structure such a query (so that it's not terribly inefficient)?

P.S. Both the title and summary fields are indexed with type=text (standard analyzer). Here is the full mapping that I am currently using:

NEWS_INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "source": {"type": "keyword"},
            "category": {"type": "text"},
            "id": {"type": "keyword"},
            "summary": {"type": "text"},
            "url": {"type": "keyword"},
            "published_date": {"type": "keyword"},
            "img_url": {"type": "text"},
            "views": {"type": "long"},
            "avg_rating": {"type": "float"},
            "num_rated": {"type": "long"}
        }
    }
}

Will appreciate any suggestion / comment! Thanks!

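By default, a multi-word search runs as an OR condition, so a document that matches any single word (for example, only "Andrew") satisfies the clause. Add "operator": "and" to your query to ensure that all words in the input are matched. Please note that with best_fields the operator is applied to each field separately, so this does not work when a name is split across fields such as first name and last name; all of the words have to appear in the same field.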
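To make the suggestion concrete, here is a minimal sketch of the request body with the operator set to "and" on the must clause only. It builds plain dicts rather than calling elasticsearch_dsl, so it runs without a cluster. The helper names multi_match and build_news_query and the default min_should_match=0 are assumptions made for illustration; the field names and tie_breaker values are copied from the question.

```python
import json

# Field names taken from the mapping in the question.
FIELDS = ["title", "summary"]


def multi_match(text, operator="or", tie_breaker=1.0):
    """Build a best_fields multi_match clause as a plain dict (hypothetical helper)."""
    return {
        "multi_match": {
            "query": text,
            "fields": FIELDS,
            "type": "best_fields",
            "operator": operator,
            "tie_breaker": tie_breaker,
        }
    }


def build_news_query(required, optional, min_should_match=0):
    """Build a bool query body (hypothetical helper).

    required: phrases whose every word must appear in a single field.
    optional: phrases that only need a partial match and raise the score.
    min_should_match: assumed default of 0; the question uses its own value.
    """
    return {
        "query": {
            "bool": {
                # operator="and" makes every word of the phrase mandatory.
                "must": [
                    multi_match(p, operator="and", tie_breaker=0.5)
                    for p in required
                ],
                # The default OR behaviour is kept for the optional keywords.
                "should": [multi_match(p) for p in optional],
                "minimum_should_match": min_should_match,
            }
        }
    }


body = build_news_query(["Andrew Yang"], ["New Hampshire debate", "election"])
print(json.dumps(body, indent=2))
```

In the elasticsearch_dsl version from the question, the same change should amount to adding operator="and" to the Q("multi_match", ...) call inside must. Note that "and" requires both words to be present but does not require them to be adjacent or in order.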
