Hello,
I am new to Elasticsearch and have written a plugin in Java to use as a phonetic transformer for the Greek language. These are the relevant analysis settings:
"analysis": {
  "filter": {
    "greek_stop": {"type": "stop", "stopwords": "greek"},
    "greek_lowercase": {"type": "lowercase", "language": "greek"},
    "greek_stemmer": {"type": "stemmer", "language": "greek"},
    "phonetics_filter": {"type": "custom_phonetics"}
  },
  "analyzer": {
    "rebuilt_greek": {"tokenizer": "standard", "filter": ["greek_lowercase", "greek_stop", "greek_stemmer"]},
    "dummy_phonetics_analyzer": {"tokenizer": "standard", "char_filter": ["phonetics_mapping"]},
    "phonetics_analyzer": {"tokenizer": "standard", "filter": ["greek_lowercase", "greek_stop", "phonetics_filter"]}
  },
  "char_filter": {
    "phonetics_mapping": {"type": "mapping", "mappings": ["οι => i", ..., "αί => E"]}
  }
}
rebuilt_greek is a simple analyzer that uses the standard tokenizer with the Greek lowercase, stop-word and stemmer filters.
phonetics_analyzer is the analyzer that uses my Java plugin, and dummy_phonetics_analyzer is an analyzer that applies a simple mapping from Greek characters to Latin ones, defined in char_filter.phonetics_mapping (the full list in char_filter.phonetics_mapping.mappings is omitted here).
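To make the behaviour of dummy_phonetics_analyzer concrete, here is a minimal sketch in plain Java of what a mapping char filter does to the text before tokenization, assuming greedy longest-match replacement at each position. It is not Elasticsearch or Lucene code: the class name PhoneticsMappingSketch and the method apply are hypothetical, and only the two rules visible above are included because the rest of the list is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PhoneticsMappingSketch {
    // Only the two rules shown in the settings; the elided rules are NOT guessed.
    // Unicode escapes avoid source-encoding problems when compiling.
    static final Map<String, String> MAPPINGS = new LinkedHashMap<>();
    static {
        MAPPINGS.put("\u03BF\u03B9", "i"); // "οι => i"
        MAPPINGS.put("\u03B1\u03AF", "E"); // "αί => E"
    }

    // Replace every occurrence of a mapping key, preferring the longest key
    // that matches at the current position (assumed greedy matching).
    public static String apply(String input) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            String bestKey = null;
            for (String key : MAPPINGS.keySet()) {
                boolean longer = bestKey == null || key.length() > bestKey.length();
                if (longer && input.startsWith(key, pos)) {
                    bestKey = key;
                }
            }
            if (bestKey != null) {
                out.append(MAPPINGS.get(bestKey));
                pos += bestKey.length();
            } else {
                // No rule matches here: copy the character unchanged.
                out.append(input.charAt(pos));
                pos++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "οικία" with only the two rules above
        System.out.println(apply("\u03BF\u03B9\u03BA\u03AF\u03B1"));
    }
}
```

Because a char filter runs on the raw text, the same rewriting happens both when documents are indexed and when the query string is analyzed, which is why this analyzer needs no custom Java code.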
The field "street_name" has two sub-fields: phonetics, analyzed with phonetics_analyzer, and dummy_phonetics, analyzed with dummy_phonetics_analyzer:
"street_name": {
  "type": "text",
  "fielddata": true,
  "analyzer": "rebuilt_greek",
  "fields": {
    "phonetics": {"type": "text", "analyzer": "phonetics_analyzer", "search_analyzer": "phonetics_analyzer"},
    "dummy_phonetics": {"type": "text", "analyzer": "dummy_phonetics_analyzer", "search_analyzer": "dummy_phonetics_analyzer"}
  }
}
When I call _analyze with phonetics_analyzer, it works as expected:
GET addrs/_analyze
{
  "text": "Πατησίων",
  "analyzer": "phonetics_analyzer"
}
Result:
{
"tokens": [
{
"token": "patisIon",
"start_offset": 0,
"end_offset": 8,
"type": "",
"position": 0
}
]
}
But when I search, I get different results. A match query on the main field finds documents:
GET addrs/_search?filter_path=hits.total
{
  "query": {
    "match": {
      "street_name": {"query": "Πατησίων"}
    }
  }
}
Result:
{
"hits": {
"total": 573
}
}
The same query on the phonetics sub-field finds nothing:
GET addrs/_search?filter_path=hits.total
{
  "query": {
    "match": {
      "street_name.phonetics": {"query": "Πατησίων"}
    }
  }
}
Result:
{
"hits": {
"total": 0
}
}
In contrast, dummy_phonetics_analyzer seems to work well:
GET addrs/_search?filter_path=hits.total
{
  "query": {
    "match": {
      "street_name.dummy_phonetics": {"query": "Πατησίων"}
    }
  }
}
Result:
{
"hits": {
"total": 573
}
}
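Why does the search on street_name.phonetics return no hits when _analyze produces the expected token? What am I doing wrong?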
Thank you in advance