Synonym Filter


(paul) #1

Hi,

My synonym file contains the following entry:

MIT,Massachusetts Institute of Technology

My settings are as follows:

"settings": {
  "analysis": {
    "analyzer": {
      "synonym": {
        "tokenizer": "my_pipe_analyzer",
        "filter": [
          "lowercase",
          "syns_filter"
        ]
      },
      "my_pipe_analyzer": {
        "tokenizer": "my_pipe_analyzer"
      },
      "autocomplete_search": {
        "type": "custom",
        "tokenizer": "my_pipe_analyzer",
        "filter": [
          "lowercase",
          "syns_filter",
          "stop"
        ]
      }
    },
    "tokenizer": {
      "my_pipe_analyzer": {
        "type": "pattern",
        "pattern": "\\|"
      }
    },
    "filter": {
      "syns_filter": {
        "synonyms_path": "synonyms/synonym_collegename.txt",
        "type": "synonym",
        "ignore_case": true
      }
    }
  }
}

I have created a pipe-separated tokenizer so that the synonyms are not
split on spaces. However, they are still being split on spaces when I
verify this with the Analyze API. Below is the output from the
Analyze API:

{
  "tokens": [
    {
      "token": "mit",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "massachusetts",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "institute",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 2
    },
    {
      "token": "technology",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 4
    }
  ]
}
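
To make the intent of the pipe tokenizer concrete, here is a rough Python sketch. It uses `re.split` as a stand-in for the `pattern` tokenizer and `str.lower` for the `lowercase` filter. The function names `pipe_tokenize` and `analyze` are made up for illustration, and this is not how Elasticsearch implements analysis. The synonym filter is deliberately left out, so the sketch only shows what the tokenizer and lowercase filter alone would be expected to produce.

```python
import re

# Stand-in for a "pattern" tokenizer configured with the regex \| :
# split on literal pipe characters only (assumption: empty tokens are dropped).
PIPE = re.compile(r"\|")


def pipe_tokenize(text):
    """Split text on '|' and discard zero-length tokens."""
    return [tok for tok in PIPE.split(text) if tok]


def analyze(text):
    """Mimic tokenizer + lowercase filter; no synonym or stop filter here."""
    return [tok.lower() for tok in pipe_tokenize(text)]


print(analyze("MIT|Massachusetts Institute of Technology"))
# ['mit', 'massachusetts institute of technology']
```

With only the tokenizer and the lowercase filter, the multi-word name stays a single token, which suggests the splitting seen in the output above happens somewhere else in the chain.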

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7516d1a7-72d0-4b3f-b426-deb80b8d6450%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(sina.tamanna) #2

Hey,

The synonym filter uses its own tokenizer to parse the entries in the
synonyms file (whitespace by default), and it is independent of the
tokenizer defined for the synonym analyzer. You need to define the
tokenizer inside the synonym filter:

"filter": {
  "syns_filter": {
    "synonyms_path": "synonyms/synonym_collegename.txt",
    "type": "synonym",
    "tokenizer": "keyword",
    "ignore_case": true
  }
}
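
A small Python sketch of why this matters, assuming the filter falls back to whitespace tokenization when no `tokenizer` is set (which is what the Elasticsearch docs describe for this filter, as far as I know). The helpers `parse_rule`, `whitespace_tokenizer`, and `keyword_tokenizer` are hypothetical stand-ins, not Elasticsearch code. They only mimic how each entry of a comma-separated synonym rule gets tokenized.

```python
def whitespace_tokenizer(text):
    """Split an entry on runs of whitespace (assumed default behaviour)."""
    return text.split()


def keyword_tokenizer(text):
    """Emit the whole entry as one token, like the 'keyword' tokenizer."""
    return [text]


def parse_rule(line, tokenizer):
    """Split a comma-separated synonym rule, then tokenize each entry."""
    return [tokenizer(entry.strip()) for entry in line.split(",")]


rule = "MIT,Massachusetts Institute of Technology"

print(parse_rule(rule, whitespace_tokenizer))
# [['MIT'], ['Massachusetts', 'Institute', 'of', 'Technology']]

print(parse_rule(rule, keyword_tokenizer))
# [['MIT'], ['Massachusetts Institute of Technology']]
```

With the whitespace stand-in, the long form becomes four separate tokens. With the keyword stand-in, it stays one token, which is the behaviour the original question was after.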


