The index-time tokenizer segments text differently from the query-time tokenizer


(Athena) #1

When we execute this search, we can't find anything:
GET rsearch/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "CoreALM智慧云",
            "fields": ["title"],
            "minimum_should_match": "100%"
          }
        }
      ],
      "filter": {
        "term": {
          "from": "CoreALM"
        }
      }
    }
  },
  "explain": true
}
But when we reduce minimum_should_match to 50%, the search returns the document whose title is "【Story】CoreALM智慧云搜索:".
In the explain output, only the term "corealm" contributes to the score; there is no score for the term "智慧云":
"description": "weight(title:corealm in 2404989) [PerFieldSimilarity],

<1>
I defined title in the mapping as follows (al_synonym is based on query_ansj):
"title": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  },
  "analyzer": "index_ansj",
  "search_analyzer": "al_synonym"
}
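
Because the field uses one analyzer at index time and another at search time, each one can be inspected separately with the `_analyze` API. Here is a minimal Python sketch that only builds the two request bodies. The helper name `analyze_requests` is hypothetical, the index and analyzer names are the ones used in this post, and actually sending the requests is left out:

```python
import json


def analyze_requests(index, text, index_analyzer, search_analyzer):
    """Return (path, body) pairs for the _analyze API, one per analyzer.

    Hypothetical helper: it only builds the requests, it does not send them.
    """
    path = f"/{index}/_analyze"
    return [
        (path, {"analyzer": index_analyzer, "text": text}),
        (path, {"analyzer": search_analyzer, "text": text}),
    ]


# Names below come from the post: index "rsearch", analyzers from the mapping.
for path, body in analyze_requests(
    "rsearch", "CoreALM智慧云", "index_ansj", "al_synonym"
):
    print("GET", path)
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Running the same text through both requests makes it easy to put the two token lists side by side.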
<2>
When I run the index_ansj analyzer on the keyword "CoreALM智慧云", I get these index-time tokens:
{
  "tokens": [
    {
      "token": "corealm",
      "start_offset": 0,
      "end_offset": 7,
      "type": "en",
      "position": 0
    },
    {
      "token": "智慧云",
      "start_offset": 7,
      "end_offset": 10,
      "type": "userDefine",
      "position": 1
    },
    {
      "token": "智慧",
      "start_offset": 7,
      "end_offset": 9,
      "type": "userDefine",
      "position": 2
    }
  ]
}
<3> When I run the search analyzer (al_synonym) on the keyword "CoreALM智慧云", I get:
{
  "tokens": [
    {
      "token": "corealm",
      "start_offset": 0,
      "end_offset": 7,
      "type": "en",
      "position": 0
    },
    {
      "token": "智慧云",
      "start_offset": 7,
      "end_offset": 10,
      "type": "userDefine",
      "position": 1
    },
    {
      "token": "研发智慧云",
      "start_offset": 7,
      "end_offset": 10,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "athena",
      "start_offset": 7,
      "end_offset": 10,
      "type": "SYNONYM",
      "position": 1
    }
  ]
}
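
To make the difference between the two token lists explicit, here is a small Python sketch that compares them as sets. The token values are copied from the two `_analyze` outputs above; the helper names `terms` and `diff_terms` are hypothetical and only serve this illustration:

```python
# Tokens copied from the index_ansj output in this post.
index_tokens = [
    {"token": "corealm", "position": 0},
    {"token": "智慧云", "position": 1},
    {"token": "智慧", "position": 2},
]

# Tokens copied from the al_synonym (search analyzer) output in this post.
search_tokens = [
    {"token": "corealm", "position": 0},
    {"token": "智慧云", "position": 1},
    {"token": "研发智慧云", "position": 1},
    {"token": "athena", "position": 1},
]


def terms(tokens):
    """Collect the distinct term strings from an _analyze token list."""
    return {t["token"] for t in tokens}


def diff_terms(index_tokens, search_tokens):
    """Split terms into shared, index-only, and search-only sets."""
    idx, qry = terms(index_tokens), terms(search_tokens)
    return {
        "shared": idx & qry,
        "index_only": idx - qry,
        "search_only": qry - idx,
    }


result = diff_terms(index_tokens, search_tokens)
for key in ("shared", "index_only", "search_only"):
    print(key, sorted(result[key]))
```

For the bare keyword, both analyzers emit corealm and 智慧云. The indexed value, however, is the full title "【Story】CoreALM智慧云搜索:", so running the same comparison on the `_analyze` output for the full title may be more telling, since the surrounding characters can change how the text is segmented.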