Hi guys, I have a problem with my analyzer. The following is the setup:
analyzer:
  type : custom
  tokenizer : whitespace
  filter : [word_delimiter, asciifolding, standard, lowercase, synonym, edgeNGram]

word_delimiter filter:
  type : word_delimiter
  preserve_original : true

edgeNGram filter:
  type : edgeNGram
  min_gram : 2
  max_gram : 15
  side : front
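For reference, the fragments above would map to index settings along these lines (a sketch only — I am reusing the filter names exactly as listed, and the synonym filter definition is omitted since it was not shown):

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["word_delimiter", "asciifolding", "standard",
                     "lowercase", "synonym", "edgeNGram"]
        }
      },
      "filter": {
        "word_delimiter": {
          "type": "word_delimiter",
          "preserve_original": true
        },
        "edgeNGram": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 15,
          "side": "front"
        }
      }
    }
  }
}
```

(The analyzer name `my_analyzer` is an assumption; your actual name may differ.)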
The problem is that when I run a curl -XGET query for
'WordWithOverFifteenChars', the result is 'WordWithOverFifteenChars' broken
down into n-grams of up to 15 characters, but the whole word itself does not
get indexed. The same happens for words like 'something.company': the whole
word does not get preserved (despite the preserve_original setting on the
word_delimiter filter); it only gets broken down into n-grams as a whole up
to 'something.compa' (15 chars), and n-grams for 'company' and 'something'
are created as well.
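To show what I mean, here is a minimal Python sketch of what a front-side edge n-gram filter with min_gram=2 and max_gram=15 appears to do to a single token (just an illustration of the behavior I observe, not Elasticsearch's actual code):

```python
# Simulate a front-side edge n-gram filter: emit prefixes of the token
# with lengths between min_gram and max_gram. Tokens longer than
# max_gram never appear in full in the output.

def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return front edge n-grams of `token` between min_gram and max_gram."""
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

# The longest emitted gram is capped at 15 chars...
print(edge_ngrams("WordWithOverFifteenChars")[-1])  # -> 'WordWithOverFif'
print(edge_ngrams("something.company")[-1])         # -> 'something.compa'

# ...and the full original token is not among the emitted grams.
print("WordWithOverFifteenChars" in edge_ngrams("WordWithOverFifteenChars"))  # -> False
```

This matches exactly what I see in the index: grams stop at 15 characters and the original word is gone.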
Could you give me some insight into what I am missing?
You received this message because you are subscribed to the Google Groups "elasticsearch" group.