Hi,
I just added a custom analyzer with a char_group tokenizer to my index, and I was surprised to find that I can include in searches all of the characters specified in tokenize_on_chars except the dot. Since all of these characters are stripped when tokens are created, I would expect one of two behaviors: either I have to specify the exact tokens in the search term, or I can include any of the characters that were stripped during tokenization. For example, searching for 2019-0 on a field that contains a date in the format 2019-01-04 finds it. On the other hand, searching for 2019.0 on a field that contains a date in the format 2019.01.04 does not find it.
So my question is: is it possible to include dots in searches?
My index looks like this:
PUT _template/template_name
{
  "index_patterns": ["template_pattern-*"],
  "settings": {
    "index": {
      "number_of_shards": "1",
      "analysis": {
        "filter": {
          "autocomplete_filter": {
            "type": "edge_ngram",
            "min_gram": 1,
            "max_gram": 30
          }
        },
        "analyzer": {
          "my_custom_analyzer": {
            "type": "custom",
            "tokenizer": "my_tokenizer",
            "filter": [
              "lowercase",
              "autocomplete_filter"
            ]
          }
        },
        "tokenizer": {
          "my_tokenizer": {
            "type": "char_group",
            "tokenize_on_chars": [ "whitespace", "-", "\n", "[", "]", ".", ",", ":", ";", "/" ]
          }
        }
      }
    }
  }
}
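
For reference, the tokenizer output can be inspected with the _analyze API (the index name template_pattern-test below is just a placeholder for an index created from this template):

POST template_pattern-test/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "2019.01.04"
}

With the settings above I would expect this to return edge n-grams of the tokens "2019", "01" and "04", with the dots stripped, the same as for the hyphenated date.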