"minimum_should_match":"100%" vs "type": "phrase"

Hi,

I'm new to Elasticsearch queries and I'm wondering what the difference is between

"minimum_should_match":"100%" and "type": "phrase".
For example:

{
    "match": {
        "tags": {
            "query": "My text",
            "minimum_should_match":"100%"
        }
    }
}

and

{
    "match": {
        "tags": {
            "query": "My text",
            "type": "phrase"
        }
    }
}

The "minimum_should_match": "100%" one behaves like a bool query with one term query per analyzed term in the must array. Er, well, it isn't literally rewritten into that, but at some layer of abstraction they are the same.
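As a rough sketch of that equivalence, assuming the tags field is analyzed with the standard analyzer (so "My text" is lowercased and split into the terms my and text), the first query matches the same documents as something like this. The exact terms depend on your mapping, so treat them as an assumption:

```json
{
    "bool": {
        "must": [
            { "term": { "tags": "my" } },
            { "term": { "tags": "text" } }
        ]
    }
}
```

Both terms have to be present, but nothing here says where in the field they appear or in what order.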

The "type": "phrase" one becomes a phrase query.

The minimum_should_match one will be faster because it just checks that all of the terms are somewhere in the text you are matching. The phrase one will be slower because it also reads positions to make sure that the terms appear in order and next to each other. The "how far apart can they be?" setting is slop, which defaults to 0.
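To illustrate slop, here is a minimal sketch using match_phrase, the dedicated phrase form of the match query. The slop value of 1 is only an example, and it again assumes tags is an analyzed text field:

```json
{
    "match_phrase": {
        "tags": {
            "query": "My text",
            "slop": 1
        }
    }
}
```

With slop set to 1, a value such as "my long text" can still match because the terms are allowed to be one position further apart. With the default of 0 it would not match.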

The two queries are scored a bit differently too. Since the minimum_should_match one never gets positional information, it just uses the frequencies of the terms. Because the phrase one does get positional information, it uses the relative closeness of the terms as well as the frequencies.

"How much slower?" and "how different is the scoring?" are kind of fuzzy and dependent on the data you are working with, the RAM you have free for the disk cache, whether positional information is hot in the cache, if you've recorded offsets in the positions, and lots of other stuff I'm forgetting right now.
