Synonyms loaded from a file are not working: malformed_input_exception

Hi, I'm trying to use a synonym file in ES 2.3.4.

Following the documentation, I ran the following command in Sense:

POST /mise/
{
  "index" : {
    "analysis" : {
      "analyzer" : {
        "synonym" : {
          "tokenizer" : "whitespace",
          "filter" : ["synonym"]
        }
      },
      "filter" : {
        "synonym" : {
          "type" : "synonym",
          "synonyms_path" : "sinonimi/synonym.txt"
        }
      }
    }
  }
}

I want all the synonyms on a line to be equivalent. Below is the format of my synonym.txt file:

"abate,priore,superiore",
"abbacchiare,avvilire,deprimere",
"abbacchiarsi,abbattersi,abbiosciarsi,accasciarsi,avvilirsi,deprimersi,disperarsi,scoraggiarsi,sgomentarsi",
"abbacchiato,abbattuto,accasciato,afflitto,affranto,annientato,costernato,demoralizzato,depresso,in crisi,infelice,malinconico,mogio,prostrato,sconfortato,scoraggiato,scorato,sfiduciato,triste",
"abbacchio,agnello"

I get the following error message:

{
  "error": {
    "root_cause": [
      {
        "type": "index_creation_exception",
        "reason": "failed to create index"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "failed to build synonyms",
    "caused_by": {
      "type": "malformed_input_exception",
      "reason": "Input length = 1"
    }
  },
  "status": 400
}

Could you please help me out?

I can't get around this issue.

Thanks, Valerio

Did you read the docs?

https://www.elastic.co/guide/en/elasticsearch/reference/2.4/analysis-synonym-tokenfilter.html#_solr_synonyms

Yes, I did!

It seems the error is related to the synonym itself.
I mean that if I use two words for a single synonym, it doesn't work.

For example the synonym row:

"abbattere,ammainare,peggiorare,tirare giu"

The term "tirare giu" causes the error. It seems to want just one token.

Is that right?

Valerio

Try:

abbattere, ammainare, peggiorare, tirare giu

Or:

abbattere, ammainare, peggiorare, tirare giu => abbattere
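Note that the two lines above carry no surrounding quotes and no trailing comma, unlike the synonym.txt shown in the question. As far as I can tell, the quoting is only needed when synonyms are listed inline in the JSON settings, not in a file in the Solr format. A minimal sketch for cleaning up an existing file line by line, where `to_solr_line` is a hypothetical helper name and not part of Elasticsearch:

```python
def to_solr_line(raw: str) -> str:
    """Turn one JSON-style quoted line into a plain Solr-format synonym line."""
    line = raw.strip()
    # Drop the trailing comma that separated entries in the JSON-style list.
    line = line.rstrip(",").strip()
    # Drop the surrounding double quotes, if both are present.
    if len(line) >= 2 and line[0] == '"' and line[-1] == '"':
        line = line[1:-1]
    # Normalise spacing around the commas between terms.
    terms = [t.strip() for t in line.split(",") if t.strip()]
    return ", ".join(terms)


print(to_solr_line('"abate,priore,superiore",'))
```

Multi-word terms such as "in crisi" are kept intact, since only the commas are treated as separators.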

Hi David, and thanks.

I must correct myself. Actually, it's not the two-token synonym that causes the error but the diacritic.

The original row was:

abbattere, ammainare, peggiorare, tirare giù

This causes the malformed input error because of the ù.

If I replace ù with u' it works:

abbattere, ammainare, peggiorare, tirare giu'

Therefore I think I'm going to replace all the diacritics.

Thanks, Valerio
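A `malformed_input_exception` with "Input length = 1" is the error Java reports when a byte sequence is not valid in the charset being decoded. A likely explanation, then, is that synonym.txt was saved in a single-byte encoding such as ISO-8859-1, where ù is one byte that is not valid UTF-8. This is an assumption about the file, not something confirmed in the thread. If it holds, re-encoding the file as UTF-8 would keep the diacritics instead of replacing them. A rough sketch, with `reencode_to_utf8` as a hypothetical helper and ISO-8859-1 as a guessed source encoding:

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Return the bytes unchanged if they are valid UTF-8, else re-encode them.

    source_encoding is an assumption; use whatever the editor actually saved in.
    """
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        return data.decode(source_encoding).encode("utf-8")


# Simulate a line saved as ISO-8859-1, then repair it.
latin1_line = "tirare giù".encode("iso-8859-1")
fixed = reencode_to_utf8(latin1_line)
print(fixed.decode("utf-8"))
```

To convert the real file, read it in binary mode, pass the bytes through the function, and write the result back in binary mode.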

I'm wondering whether it is possible to define an asciifolding filter (to remove diacritics) for the synonym filter?

Thanks, Valerio

You can apply the asciifolding token filter before the synonym filter in the filter chain when you define an analyzer.
So tirare giù will become tirare giu before reaching the synonym token filter.
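The filter order described above could look like the following sketch, which builds the index settings as a Python dict and prints the JSON body to paste into Sense. The analyzer name, filter name and synonyms path are taken from the original request; putting `asciifolding` ahead of `synonym` is the only change. One assumption to flag: the entries in synonym.txt would also have to be written without diacritics so that they match the folded tokens.

```python
import json

# Index settings with asciifolding placed before the synonym filter.
# Assumption: synonym.txt itself contains folded terms (e.g. "tirare giu").
settings = {
    "index": {
        "analysis": {
            "analyzer": {
                "synonym": {
                    "tokenizer": "whitespace",
                    # Order matters: fold diacritics first, then apply synonyms.
                    "filter": ["asciifolding", "synonym"],
                }
            },
            "filter": {
                "synonym": {
                    "type": "synonym",
                    "synonyms_path": "sinonimi/synonym.txt",
                }
            },
        }
    }
}

print(json.dumps(settings, indent=2))
```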