Is it possible to apply an N-gram token filter conditionally? Even a plain yes or no would help.
More explanation: I have the text "The animal is cute". I want to know whether it is possible to use min_gram=3 when word.length >= 3 and min_gram=2 when word.length < 3.
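To make the rule concrete, here is a minimal sketch in plain C# of the grams I would like to end up with for each token. It does not use Elasticsearch or NEST at all: the class ConditionalNGramSketch and the method Grams are hypothetical names invented for this illustration, and max_gram=10 is taken from my current settings.

```csharp
using System;
using System.Collections.Generic;

public static class ConditionalNGramSketch
{
    // Hypothetical helper for illustration only; not part of NEST or Elasticsearch.
    // Returns the grams a token should produce under the desired rule.
    public static List<string> Grams(string word, int maxGram = 10)
    {
        // Desired rule: min_gram = 2 for words shorter than 3 characters,
        // min_gram = 3 for all other words.
        int minGram = word.Length < 3 ? 2 : 3;

        var grams = new List<string>();
        for (int start = 0; start < word.Length; start++)
        {
            // Emit every substring starting here whose length is within [minGram, maxGram].
            for (int len = minGram; len <= maxGram && start + len <= word.Length; len++)
            {
                grams.Add(word.Substring(start, len));
            }
        }
        return grams;
    }

    public static void Main()
    {
        // Tokens of "The animal is cute" after lowercasing.
        foreach (var word in new[] { "the", "animal", "is", "cute" })
        {
            Console.WriteLine($"{word}: {string.Join(", ", Grams(word))}");
        }
    }
}
```

With this rule, "is" would still be indexed as the single gram "is", while "cute" would only produce "cut", "cute" and "ute", with no 2-character grams.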
Indexing every token with min_gram=2 could be costly on the server, while min_gram=3 drops some short Persian words.
Any hint is appreciated. If the answer is yes, I would be very thankful for an example.
This is my current code using NEST. I need to know whether I can vary the N-gram token filter settings for different words:
var suggestionindexResponse = await client.Indices.CreateAsync("suggestionindex", c => c
    .Settings(s => s
        .Setting(UpdatableIndexSettings.MaxNGramDiff, 8)
        .Analysis(a => a
            // Convert the zero-width non-joiner U+200C (half-space, typed with Ctrl+- or Shift+Ctrl+4) to a regular space
            .CharFilters(cf => cf.Mapping("mycharfilter", m => m.Mappings(new[] { "\\u200C=>\\u0020" })))
            .Analyzers(aa => aa
                // Index-time analyzer: includes the N-gram filter
                .Custom("suggestionanalyzer", ca => ca
                    .CharFilters("mycharfilter")
                    .Filters(new List<string> { "lowercase", "my_stopword", "mynGram", "decimal_digit", "arabic_normalization", "persian_normalization" })
                    .Tokenizer("standard"))
                // Search-time analyzer: same chain without the N-gram filter
                .Custom("suggestionsearchanalyzer", ca => ca
                    .CharFilters("mycharfilter")
                    .Filters(new List<string> { "lowercase", "my_stopword", "decimal_digit", "arabic_normalization", "persian_normalization" })
                    .Tokenizer("standard"))
            )
            .TokenFilters(tf => tf
                .Stop("my_stopword", st => st.StopWords("_persian_"))
                .NGram("mynGram", td => td
                    .MaxGram(10).MinGram(2))
            )
        ))
    .Map<WordSuggestion>(m => m
        .Dynamic(DynamicMapping.Strict)
        .AutoMap()
        .Properties(ps => ps
            .Text(s => s.Name(n => n.word).Analyzer("suggestionanalyzer").SearchAnalyzer("suggestionsearchanalyzer"))
            .Number(n => n.Name(i => i.Id).Type(NumberType.Integer))
        )));