Synonyms not working with diacritic characters


#1

Hi,

I'm using ES 5.5 and I can't use some words with diacritics (støvmaske, buntebånd, etc.) in the synonyms file.

When I PUT the following index settings:

{
	"settings": {
		"analysis": {
			"analyzer": {
				"synonym_analyzer": {
					"type": "custom",
					"filter": ["lowercase", "asciifolding", "trim", "my_synonyms"],
					"tokenizer" : "standard"
				}
			},
			"filter": {
				"my_synonyms": {
					"type": "synonym",
					"synonyms_path": "analysis/Norway/synonyms.txt",
					"tokenizer" : "standard"
				}
			}
		}
	}
}

I'm getting the response:

{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "failed to build synonyms"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "failed to build synonyms",
        "caused_by": {
            "type": "malformed_input_exception",
            "reason": "Input length = 1"
        }
    },
    "status": 400
}

These settings work fine when the synonym file contains only "standard" words (ones without diacritics).

What am I missing?


#2

It turned out that I needed to change the synonyms file's encoding to UTF-8. After that it started working correctly.
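
For anyone hitting the same error: "Input length = 1" under malformed_input_exception is the message Java produces when a byte sequence cannot be decoded in the expected charset, which fits a synonyms file that was saved in something other than UTF-8. Below is a minimal sketch of how the file could be checked and converted. It assumes the original file was saved as ISO-8859-1, which the thread does not confirm; the function names and the example path are hypothetical.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes as UTF-8.

    source_encoding is an assumption: set it to whatever the file
    was actually saved in. Bytes that are already valid UTF-8 are
    returned unchanged.
    """
    if is_valid_utf8(data):
        return data
    return data.decode(source_encoding).encode("utf-8")


def convert_file(path: str, source_encoding: str = "iso-8859-1") -> None:
    """Rewrite a synonyms file in place as UTF-8 (hypothetical helper)."""
    file_path = Path(path)
    file_path.write_bytes(to_utf8(file_path.read_bytes(), source_encoding))


# Example call (hypothetical path, relative to the ES config directory):
# convert_file("analysis/Norway/synonyms.txt")
```

Keep a backup of the file before converting, since guessing the wrong source encoding will silently produce wrong characters.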

