How can I use "minimum_should_match" correctly?

Are these two syntaxes not equivalent?

  • first:
GET doc_data/_search
{
  "query": {
    "match": {
      "title": {
        "query": "农村信用银行",
        "minimum_should_match": 2
      }
    }
  }
}
  • second:
GET doc_data/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title":{"value": "农村信用"}}},
        { "term": { "title":{"value": "银行"}}},
        { "term": { "title":{"value": "农信银"}}}
      ],
      "minimum_should_match": 3
    }
  },
  "_source": ["title"]
}

But I get hits with the second query, while the first returns nothing.

I have a synonym.txt with one line: "农村信用银行, 农信银",
so when I analyze "农村信用银行" I get three tokens, like this:

GET doc_data/_analyze
{
  "field": "title",
  "text": "农村信用银行"
}
{
  "tokens" : [
    {
      "token" : "农村信用",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "农信银",
      "start_offset" : 0,
      "end_offset" : 6,
      "type" : "SYNONYM",
      "position" : 0,
      "positionLength" : 2
    },
    {
      "token" : "银行",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 1
    }
  ]
}

So why do I get results with the second query but none with the first?

PS: What exactly does "minimum_should_match" mean when I use "multi_match"?

Thank you for any reply.

The second works because all three term queries under should match, so a minimum_should_match of 3 is satisfied. The first fails because your query has only one search criterion while minimum_should_match is greater than 1 (in your case 2). Hope it helps.

Ideally it should match 3 out of 3 in the second case, and it does, so you get results. In the first it is 1 out of 2, which is why it fails.

Thank you for your reply.

But why does the first have only one search criterion?

Did you mean the first is equivalent to:

GET doc_data/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title":{"value": "农村信用银行"}}}
      ],
      "minimum_should_match": 2
    }
  },
  "_source": ["title"]
}

?

Yes. For the use case below, you do not need minimum_should_match because you only have one search term.

GET doc_data/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "title":{"value": "农村信用银行"}}}
      ],
      "minimum_should_match": 2
    }
  },
  "_source": ["title"]
}

But why?

I found something strange when I use synonyms with Chinese words. I'll show you:

(Notice that I use profile; let's look at the results.)

GET doc_data/_search
{
  "query": {
    "match": {
      "title": {
        "query": "农村信用银行",
        "minimum_should_match": 2
      }
    }
  },
  "_source": ["title"],
  "profile": true
}
  1. Without the synonym: the hits contain the data I want. I only show the "profile" part here.

response: (see the description field)

  "profile" : {
    "shards" : [
      {
        "id" : "[Ph0SrX_qSsGKjXBC7xn-Pw][doc_data][1]",
        "searches" : [
          {
            "query" : [
              {
                "type" : "BooleanQuery",
                "description" : "(title:农村信用 title:农村 title:信用 title:银行)~2",
                "time_in_nanos" : 11748,
                "breakdown" : {

  2. With the synonym "农村信用银行,农信银": the hits do not contain the data I want.

response: (see the description field)

  "profile" : {
    "shards" : [
      {
        "id" : "[Ph0SrX_qSsGKjXBC7xn-Pw][doc_data][0]",
        "searches" : [
          {
            "query" : [
              {
                "type" : "BooleanQuery",
                "description" : """+(((title:"农村信用 农村 信用 银行" title:农信银))~2) #DocValuesFieldExistsQuery [field=_primary_term]""",
                "time_in_nanos" : 88444,
                "breakdown" : {

So why do I get "+(((title:"农村信用 农村 信用 银行" title:农信银))~2)" with the synonym, and "(title:农村信用 title:农村 title:信用 title:银行)~2" without it?

I mean, when I use the synonym, why don't I get "(title:农村信用 title:农村 title:信用 title:银行 title:农信银)~2"?

Hope you understand what I mean. I'm new to ES. Thank you.
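
A note on the two description strings above: the second one has the shape the match query produces when the analyzer emits a multi-term synonym. The multi-token side is turned into a phrase query, which leaves two should clauses (the phrase and title:农信银), and "~2" then requires both of them to match. The match query has a documented auto_generate_synonyms_phrase_query parameter (default true) that controls this phrase generation. The request below is a minimal sketch, assuming the same index and field as above. Whether it returns the expected hits depends on how the documents were indexed, so it is worth checking the profile output rather than treating this as a fix:

```json
GET doc_data/_search
{
  "query": {
    "match": {
      "title": {
        "query": "农村信用银行",
        "minimum_should_match": 2,
        "auto_generate_synonyms_phrase_query": false
      }
    }
  },
  "_source": ["title"],
  "profile": true
}
```

With the parameter set to false, the documentation describes the multi-term side as a conjunction of its terms instead of a phrase, so the description field should change accordingly.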

Can you try the first request without "minimum_should_match": 2 and check? If that does not work, can you send me the index settings (GET doc_data/) so that I can check? I do not read Chinese, so I cannot tell what the content actually means. Hope it helps.

GET doc_data/_search
{
  "query": {
    "match": {
      "title": {
        "query": "农村信用银行"
      }
    }
  },
  "_source": ["title"]
}

With the first request without "minimum_should_match": 2, I get hits with data. Leaving out minimum_should_match is the same as "minimum_should_match": 1, right?

I ran a lot of tests and found (even though I don't know the reason yet) that if I define a long string (many Chinese characters, like "农村信用银行") and a short word (like "农信银") as synonyms, ES treats the long string as a single "WORD".

Thank you very much.
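
The _analyze output earlier in the thread shows three tokens, while the profiled query contains four terms plus the synonym, so the query text may be analyzed differently from what _analyze with "field" reports. One way to check, sketched here with the index and field names from the thread, is to look at the field mapping and index settings and to ask _analyze for per-filter details:

```json
GET doc_data/_mapping/field/title

GET doc_data/_settings

GET doc_data/_analyze
{
  "field": "title",
  "text": "农村信用银行",
  "explain": true
}
```

If the mapping lists a search_analyzer that differs from analyzer, the tokens used at query time will not be the ones shown by the plain _analyze call.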

And I want to know how ES handles the two queries below internally:
1.

{
  "query": {
    "match": {
      "title": {
        "query": ""
      }
    }
  }
}

2.

{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "",
            "fields": ["title", "content"],
            "minimum_should_match": 2
          }
        }
      ]
    }
  }
}

Are they all converted to a bool should query, like this:

{
  "query": {
    "bool": {
      "should": [
        { "term": { "title":{"value": ""}}},
        { "term": { "title":{"value": ""}}},
        { "term": { "title":{"value": ""}}}
        ...
      ],
      "minimum_should_match": 2
    }
  }
}
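
For the multi_match case, the documentation describes the default best_fields type as generating one match query per field and wrapping them in a dis_max query, with operator and minimum_should_match applied to each field individually. A rough equivalent of the second query, assuming the default type and using the query text from earlier in the thread in place of the empty string:

```json
GET doc_data/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "dis_max": {
            "queries": [
              { "match": { "title":   { "query": "农村信用银行", "minimum_should_match": 2 }}},
              { "match": { "content": { "query": "农村信用银行", "minimum_should_match": 2 }}}
            ]
          }
        }
      ]
    }
  }
}
```

Each inner match query is analyzed on its own, so minimum_should_match counts the clauses produced for that field only, not across title and content.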

Where can I learn more about this?

Thanks again
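
Besides "profile": true, the validate API with explain=true returns the Lucene query as a string without running the search, which makes it quick to compare how different request bodies are translated. A minimal sketch, assuming the same index and field as above:

```json
GET doc_data/_validate/query?explain=true
{
  "query": {
    "match": {
      "title": {
        "query": "农村信用银行",
        "minimum_should_match": 2
      }
    }
  }
}
```

The response carries the query string in explanations[].explanation, in the same notation as the profile description field.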
