We use a language-specific analyzer.
As I understand it, a language-specific analyzer is a combination of the
standard tokenizer, customized stop words, and language-specific
stemming (I don't know whether that last part is done through snowball, though).
I want to add a synonyms feature, but I'm guessing I can't just add the
synonym filter to the language-specific analyzer, since the tokens the
synonym filter would receive would already be stemmed.
If there is somewhere I can see the exact combination of filters the
language analyzers are made of, I could rebuild that chain myself and plug
the synonym filter in before the stemming.
Is that possible, or am I just guessing blindly here? (Quite probably; these
are my first steps with Elasticsearch.) Is there any other approach?
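
To make the idea concrete, here is a rough sketch of the kind of index settings I have in mind. It assumes the English analyzer is roughly standard tokenizer + lowercase + stop words + snowball stemming, which I have not confirmed. The names my_stop, my_synonyms, my_stemmer and english_with_synonyms are made up for illustration, and Python is only used to build the JSON body.

```python
import json


def analyzer_settings(synonyms, language="English", stopwords="_english_"):
    """Build index settings for a custom analyzer that expands synonyms
    before stemming. All filter and analyzer names here are hypothetical."""
    return {
        "analysis": {
            "filter": {
                # Assumption: the built-in analyzer uses a plain stop filter.
                "my_stop": {"type": "stop", "stopwords": stopwords},
                # Synonym rules are written against unstemmed words.
                "my_synonyms": {"type": "synonym", "synonyms": synonyms},
                # Assumption: stemming is done through snowball.
                "my_stemmer": {"type": "snowball", "language": language},
            },
            "analyzer": {
                "english_with_synonyms": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Order matters: synonyms come before the stemmer.
                    "filter": ["lowercase", "my_stop", "my_synonyms", "my_stemmer"],
                }
            },
        }
    }


settings = analyzer_settings(["laptop, notebook", "tv, television"])
print(json.dumps(settings, indent=2))
```

The printed JSON would go into the settings section when creating the index. The point is only the filter order: the synonym filter sees lowercased, unstemmed tokens, and the stemmer runs last.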
Ok, I've found this link:
http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/example/solr/conf/schema.xml
Although it's painfully hard to read, it looks like it shows what the
language-specific analyzers are really made of. Is Elasticsearch using
this, or is there another place to look?
On Sunday, May 27, 2012 9:28:51 PM UTC-4, JoeZ99 wrote: