Hey guys
We recently upgraded from 6.8 to 7.10 and are still discovering issues. It seems like the minimum_should_match behaviour has been silently changed.
Consider the following reproducible example:
PUT /test-bench
{
  "mappings": {
    "properties": {
      "label": { "type": "text" },
      "object": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "text"
          }
        }
      }
    }
  }
}
POST /test-bench/_doc/1
{
  "label": "how are you doing",
  "object": {
    "name": "how are you doing"
  }
}
GET /test-bench/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "label": {},
      "object.name": {}
    }
  }
}
GET /test-bench/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "object",
            "query": {
              "match_phrase": {
                "object.name": {
                  "query": "how long does take"
                }
              }
            }
          }
        }
      ],
      "must": [
        {
          "term": {
            "_id": {
              "value": "1"
            }
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "label": {},
      "object.name": {}
    }
  }
}
One would expect this to return no results, but it returns the document. Not only that, but the highlight doesn't make sense either: match_phrase shouldn't highlight partial matches.
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test-bench",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "label" : "how are you doing",
          "object" : {
            "name" : "how are you doing"
          }
        },
        "highlight" : {
          "object.name" : [
            "<em>how</em> are you doing"
          ]
        }
      }
    ]
  }
}
Forcing minimum_should_match to 1 makes this behave as expected. I wonder why that might be. I have seen previous responses claiming this shouldn't be needed.
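For reference, here is a sketch of that workaround. It is the same search as above with one added line, "minimum_should_match": 1, inside the bool query; everything else is unchanged:

```
GET /test-bench/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "object",
            "query": {
              "match_phrase": {
                "object.name": {
                  "query": "how long does take"
                }
              }
            }
          }
        }
      ],
      "must": [
        {
          "term": {
            "_id": {
              "value": "1"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "fields": {
      "label": {},
      "object.name": {}
    }
  }
}
```

With this setting at least one should clause has to match, so the must clause on _id alone is no longer enough to return document 1.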
Could someone shed light on this?
Is it that, starting with Elasticsearch 7.0, minimum_should_match defaults to 0 as soon as a must clause is added? A prompt response would be appreciated.
Thanks,
Nick