Dear talents,
I hope everyone is doing well. Could you kindly review the statement below and suggest possible fixes?
The keyword is "alt legal".
The search engine is expected to return titles that contain either the full phrase "alt legal" or both of the words "alt" and "legal". The entry "bluh bluh Alt legal" should therefore be ranked at the top. However, the current top result is "LegalType", which contains "alT" and "Legal" only as fragments, not as complete words.
Has anyone run into something like this?
What is the mapping of the fields you are querying?
Thanks for your reply, @Christian_Dahlqvist.
Both of them are string types.
I replaced displayName with title to make the example easier to follow.
What is the mapping of the fields in the index as shown using the get mapping API?
Here is the mapping:
"description" : {
  "type" : "text",
  "analyzer" : "full",
  "search_analyzer" : "full_search"
},
"displayName" : {
  "type" : "text",
  "fields" : {
    "raw" : {
      "type" : "keyword"
    }
  },
  "analyzer" : "full",
  "search_analyzer" : "full_search"
},
What is the definition of the "full_search" and "full" analysers? Why do you have a separate search analyser that is different from the index analyser?
Well, could this interfere with searching for "alt legal"?
I am actually new to this search engine.
It is key to how the search is executed, so you need to provide this information.
Here are the details:
"analyzer": {
  "full": {
    "tokenizer": "full",
    "filter": [
      "lowercase",
      "asciifolding",
      "english_stop",
      "synonym"
    ],
    "char_filter": [
      "html_strip"
    ]
  },
  "full_search": {
    "tokenizer": "search",
    "filter": [
      "lowercase",
      "asciifolding",
      "english_stop",
      "synonym"
    ],
    "char_filter": [
      "html_strip"
    ]
  }
}
Could you please take a look?
What is the definition of your synonym filter?
I don't think there is a synonym for it in this list.
Actually, there is no data that is related to "legaltrek"
Why does this issue happen?
I have not had time to recreate this, so I would recommend that you use the analyze API to compare how the search string and the matching indexed data are analyzed.
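As a rough sketch of that comparison, the snippet below only builds the two request bodies you would send to the `_analyze` endpoint, one for the index-time analyzer and one for the search-time analyzer. The index name `my_index` is a placeholder (the thread never gives the real one), and nothing here actually contacts a cluster; paste the printed requests into Kibana Dev Tools or send them with whatever client you use.

```python
import json

# Hypothetical index name; the thread does not mention the real one.
INDEX = "my_index"


def analyze_request(analyzer: str, text: str) -> dict:
    """Build the method, path and JSON body for an _analyze call."""
    return {
        "method": "GET",
        "path": f"/{INDEX}/_analyze",
        "body": {"analyzer": analyzer, "text": text},
    }


# Index side: how the stored title is broken into tokens.
index_side = analyze_request("full", "LegalType")
# Search side: how the query string is broken into tokens.
search_side = analyze_request("full_search", "alt legal")

for req in (index_side, search_side):
    print(req["method"], req["path"])
    print(json.dumps(req["body"], indent=2))
```

If any token from the second response also appears in the first response, the document matches on that token, which is the overlap to look for.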
A couple of notes here. First, be aware that if the synonym filter comes last in the chain, the synonyms in your list are not processed by any later filter, so they need to be written in their final form.
Second, what are the "search" and "full" tokenizers? There is probably no need for all of these customizations. You should start with the out-of-the-box defaults and then make adjustments as necessary to achieve a specific result. My assumption is that y
LegalType is likely getting n-grammed, so it is tokenized into ["Leg", "Lega", "Legal", "LegalT", "LegalTy", "LegalTyp", "LegalType", "egalType", "galType", "alType", "lType", "Type", "ype", ..., "alT"] (note that this is not the complete list).
After tokenization you have the tokens "alT" and "Legal", which are then lowercased. See this: N-gram tokenizer | Elasticsearch Guide [8.14] | Elastic
But as Christian mentioned, use the analyze API to see how your analyzers actually behave, and you'll see the issue.
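To make the n-gram explanation concrete, here is a small pure-Python imitation of n-gram tokenization followed by lowercasing. It is not Elasticsearch code, and the `min_gram`/`max_gram` values of 3 and 9 are only a guess based on the token list above, since the actual tokenizer definitions were never posted.

```python
def ngrams(text: str, min_gram: int, max_gram: int) -> list[str]:
    """Return every substring of text whose length is in [min_gram, max_gram]."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(text):
                break  # longer grams from this start would run past the end
            grams.append(text[start:end])
    return grams


# Assumed settings; the thread never shows the real tokenizer config.
MIN_GRAM, MAX_GRAM = 3, 9

# Imitate the index-side chain: n-gram tokenizer, then lowercase filter.
indexed = {g.lower() for g in ngrams("LegalType", MIN_GRAM, MAX_GRAM)}

# The query split on whitespace, already lowercase.
query_terms = "alt legal".split()

for term in query_terms:
    print(term, term in indexed)
# alt True
# legal True
```

Both query words turn up among the lowercased fragments of "LegalType", which would explain why that title matches even though it contains neither word as a whole word.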