Goal
Enable spelling corrections for tokenized fields
Settings:
{
  "analyzer": {
    "analyzer_search": {
      "filter": [ "standard", "lowercase" ],
      "char_filter": [ "html_strip" ],
      "type": "custom",
      "tokenizer": "standard"
    },
    "analyzer": {
      "filter": [ "standard", "lowercase" ],
      "char_filter": [ "html_strip" ],
      "type": "custom",
      "tokenizer": "ngram_tokenizer"
    }
  },
  "tokenizer": {
    "ngram_tokenizer": {
      "token_chars": [ "letter", "digit" ],
      "min_gram": "3",
      "type": "nGram",
      "max_gram": "20"
    }
  }
}
Original mapping
{
  "mappings": {
    "searchitemdocument": {
      "properties": {
        "title": {
          "search_analyzer": "analyzer_search",
          "search_quote_analyzer": "analyzer",
          "analyzer": "analyzer",
          "type": "text"
        }
      }
    }
  }
}
Result
This gives suggestions taken from the ngrams, e.g. suggesting service -> servic, which is not a word.
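The behaviour makes sense once you look at what the ngram tokenizer emits for a word. A minimal sketch in plain Python (not the actual Lucene tokenizer, so token order may differ) with the same min_gram 3 and max_gram 20:

```python
def ngrams(text, min_gram=3, max_gram=20):
    """Emit every character ngram of text whose length is between
    min_gram and max_gram, mimicking the ngram tokenizer for one word."""
    out = []
    for start in range(len(text)):
        for length in range(min_gram, max_gram + 1):
            if start + length > len(text):
                break
            out.append(text[start:start + length])
    return out

print(ngrams("service"))
```

Among the emitted terms is servic, so the term suggester sees it as a perfectly good indexed term and proposes it as a "correction", even though it is not a word.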
I have solved this by creating an additional field with a different analyzer, like this:
New mapping for suggestions
{
  "mappings": {
    "searchitemdocument": {
      "properties": {
        "title": {
          "search_analyzer": "analyzer_search",
          "search_quote_analyzer": "analyzer",
          "analyzer": "analyzer",
          "type": "text"
        },
        "title_suggest": {
          "analyzer": "no_suggest_analyzer",
          "type": "text"
        }
      }
    }
  }
}
```
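For completeness: the mapping above references no_suggest_analyzer, which also has to be defined in the index settings. A minimal sketch of what such an analyzer could look like (whole-word tokens instead of ngrams; the exact filter chain is an assumption, not part of the original settings):

```json
"analyzer": {
  "no_suggest_analyzer": {
    "filter": [ "standard", "lowercase" ],
    "char_filter": [ "html_strip" ],
    "type": "custom",
    "tokenizer": "standard"
  }
}
```

The key point is that it uses the standard tokenizer, so only complete words end up in the suggest field's term dictionary.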
I can now use the new field for term suggestion search.
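For reference, a term suggester request against the new field might look like this (the index name, suggestion name, and input text are placeholders, not from the original post):

```json
POST /searchitem/_search
{
  "suggest": {
    "title_suggestion": {
      "text": "servise",
      "term": {
        "field": "title_suggest"
      }
    }
  }
}
```

Because title_suggest holds whole-word terms, the suggester now proposes real words rather than ngram fragments.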
This works, but I now insert all data twice.
Is there a better way?