Combining language-specific analyzer and synonym token filter


(José de Zárate) #1

We use a language-specific analyzer.
As I understand it, a language-specific analyzer is a combination of the
standard tokenizer, customized stop words, and language-specific stemming
(I don't know whether that last part is done through Snowball, though).

I want to add a synonyms feature, but I'm guessing I cannot just add the
synonym filter to the language-specific analyzer, since the tokens that the
synonym filter would receive would be 'stemmed' versions.
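
To make the concern concrete, here is a small pure-Python sketch of the two
filter orderings. It does not call Elasticsearch at all: `toy_stem` is a
deliberately crude, hypothetical stemmer and `SYNONYMS` is a made-up rule
table, both invented for illustration only.

```python
# Hypothetical synonym table, keyed on the unstemmed surface form.
SYNONYMS = {"laptops": ["notebooks"]}


def toy_stem(token):
    """Crude stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def expand_synonyms(tokens):
    """Emit each token followed by any synonyms registered for it."""
    out = []
    for token in tokens:
        out.append(token)
        out.extend(SYNONYMS.get(token, []))
    return out


def synonyms_then_stem(tokens):
    # Synonym filter sees the original tokens, so the rule matches.
    return [toy_stem(t) for t in expand_synonyms(tokens)]


def stem_then_synonyms(tokens):
    # Synonym filter sees stemmed tokens, so "laptops" is never matched.
    return expand_synonyms([toy_stem(t) for t in tokens])


if __name__ == "__main__":
    query = ["cheap", "laptops"]
    print(synonyms_then_stem(query))
    print(stem_then_synonyms(query))
```

With the synonym step first, the rule fires and both terms are stemmed
afterwards; with stemming first, the rule would only fire if the synonym
list itself were written in stemmed form.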

If there is somewhere I can see the exact combination of filters the
language analyzers are made of, I could plug the synonym filter in before
the stemming.
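
A minimal sketch of what that could look like, written as a Python dict that
would be sent as the index settings. It assumes the English analyzer can be
approximated by standard tokenizer, lowercase, stop words, and a Snowball
stemmer; the real built-in chain may differ. The names `my_synonyms`,
`my_stop`, `my_stemmer`, and `english_with_synonyms` are made up for the
example, and the `_english_` stop word shortcut is assumed to be available
in the version in use.

```python
import json


def build_settings(synonyms):
    """Build index settings for a custom analyzer with synonyms before stemming.

    `synonyms` is a list of rules in the comma-separated synonym format,
    e.g. "laptop, notebook". All filter and analyzer names are hypothetical.
    """
    return {
        "analysis": {
            "filter": {
                "my_synonyms": {"type": "synonym", "synonyms": list(synonyms)},
                # Assumption: "_english_" selects the default English stop words.
                "my_stop": {"type": "stop", "stopwords": "_english_"},
                "my_stemmer": {"type": "snowball", "language": "English"},
            },
            "analyzer": {
                "english_with_synonyms": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Order matters: synonyms are applied before the stemmer.
                    "filter": ["lowercase", "my_synonyms", "my_stop", "my_stemmer"],
                }
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_settings(["laptop, notebook"]), indent=2))
```

The printed JSON would go into the `settings` section when creating the
index, and the field mapping would then reference `english_with_synonyms`
as its analyzer.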

Is that possible, or am I just throwing stones here? (Quite probably; these
are my first steps with Elasticsearch.) Is there any other approach?



(José de Zárate) #2

OK, I've found this link:
http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/example/solr/conf/schema.xml
Although it's painfully hard to review, it looks like it shows what
language-specific analyzers are really made of. Is Elasticsearch using
this, or is there another place to look?

(In reply to post #1 by JoeZ99, Sunday, May 27, 2012 9:28:51 PM UTC-4)


(system) #3