Hi,
I have a synonym text file containing full-width digits, half-width digits, and the Chinese-character equivalents of the digits.
For example, one row in MySynonym.txt is "7,7,七,柒".
My filter settings contain:

    "local_synonym_test": {
      "type": "synonym",
      "synonyms_path": "analysis/MySynonym.txt"
    }
My analyzer settings contain:

    "my_ngram_analyzer_test": {
      "type": "custom",
      "tokenizer": "my_ngram_tokenizer",
      "filter": [
        "local_synonym_test",
        "dash_as_alphanum_fltr"
      ]
    }
My tokenizer settings contain:

    "my_ngram_tokenizer": {
      "type": "ngram",
      "token_chars": [
        "letter",
        "digit"
      ],
      "min_gram": "1",
      "max_gram": "1"
    }
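For context, here is how the fragments above would fit together in a single index-creation request (a sketch: the index name is taken from the _analyze call below, and the definition of the unrelated dash_as_alphanum_fltr filter is omitted because it is not shown in this post):

```json
PUT mytestindex
{
  "settings": {
    "analysis": {
      "filter": {
        "local_synonym_test": {
          "type": "synonym",
          "synonyms_path": "analysis/MySynonym.txt"
        }
      },
      "analyzer": {
        "my_ngram_analyzer_test": {
          "type": "custom",
          "tokenizer": "my_ngram_tokenizer",
          "filter": [
            "local_synonym_test",
            "dash_as_alphanum_fltr"
          ]
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "ngram",
          "token_chars": ["letter", "digit"],
          "min_gram": "1",
          "max_gram": "1"
        }
      }
    }
  }
}
```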
When I execute _analyze multiple times, I get inconsistent synonym results:

    GET mytestindex/_analyze
    {
      "analyzer": "my_ngram_analyzer_test",
      "text": "7杯"
    }
Sometimes I get RESULT1: only 7 at position 0.
Sometimes I get RESULT2: all of 7, 7, 七, and 柒 at position 0.
On Windows I consistently get RESULT2, but on CentOS the result varies.
What should I do to consistently get RESULT2 on CentOS?
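One environment difference I have tried to rule out is the synonym file's encoding: a file edited on Windows may carry a UTF-8 BOM or CRLF line endings that a Linux host sees as extra bytes in the first or last token of each rule. This is only an assumption, not a confirmed cause; the sample file below stands in for the real analysis/MySynonym.txt:

```shell
# Simulate a Windows-edited synonym file: UTF-8 BOM + CRLF line ending
# (the bytes spell out the row "7,7,七,柒" from the question).
printf '\xef\xbb\xbf7,7,\xe4\xb8\x83,\xe6\x9f\x92\r\n' > MySynonym.sample.txt

# 1. A UTF-8 BOM shows up as the bytes "ef bb bf" at the start of the file:
head -c 3 MySynonym.sample.txt | od -An -tx1

# 2. Count lines containing CR (carriage return); non-zero means CRLF endings:
grep -c $'\r' MySynonym.sample.txt

# 3. Normalize: strip the BOM and all CR characters into a clean copy:
sed '1s/^\xef\xbb\xbf//' MySynonym.sample.txt | tr -d '\r' > MySynonym.clean.txt
```

If the real file on CentOS shows a BOM or CRLF endings, replacing it with the normalized copy (and reopening the index) would rule this factor out.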
Thank you.