I'm implementing a search process using Elasticsearch, which currently uses
the snowball token filter (French). I have taken a look at the stemmer token
filter, which seems to do the same thing. Can someone explain the
difference between the stemmer token filter and the snowball token filter,
and also the difference between these stemmer configurations: french,
light_french, minimal_french?
Rule of thumb: use this stemmer if you want a stemmer designed specifically
for French, with the help of a stopword list.
The light stemmer: a statistical approach to stemming that is applicable to
several languages, based on the algorithm in Jacques Savoy's "Light Stemming
Approaches for the French, Portuguese, German and Hungarian Languages"
(available on RERO DOC).
Rule of thumb: use this stemmer if you prefer statistical methods that
should keep good retrieval quality while stemming even fewer words.
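
To compare the options side by side, here is a minimal sketch (not part of
the original reply) of index settings that define the snowball filter and
the three stemmer variants, each wrapped in its own analyzer. The filter and
analyzer names (fr_snowball, fr_light, and so on) are made up for
illustration, and the exact parameter spelling should be checked against the
documentation for your Elasticsearch version.

```python
import json

# Candidate stemming filters mentioned in this thread.
# "snowball" and "stemmer" are Elasticsearch token filter types; the keys
# on the left (fr_snowball, ...) are hypothetical names for this sketch.
STEMMING_FILTERS = {
    "fr_snowball": {"type": "snowball", "language": "French"},
    "fr_stemmer": {"type": "stemmer", "language": "french"},
    "fr_light": {"type": "stemmer", "language": "light_french"},
    "fr_minimal": {"type": "stemmer", "language": "minimal_french"},
}


def build_settings(filters=STEMMING_FILTERS):
    """Return index settings with one custom analyzer per stemming filter."""
    token_filters = dict(filters)
    # Stopword filter using the predefined French stopword list.
    token_filters["fr_stop"] = {"type": "stop", "stopwords": "_french_"}
    analyzers = {
        name + "_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            # Lowercase, drop stopwords, then apply the stemmer under test.
            "filter": ["lowercase", "fr_stop", name],
        }
        for name in filters
    }
    return {
        "settings": {
            "analysis": {"filter": token_filters, "analyzer": analyzers}
        }
    }


if __name__ == "__main__":
    # Print the JSON body that could be sent when creating a test index.
    print(json.dumps(build_settings(), indent=2))
```

With a test index created from these settings, the _analyze API can be run
with each analyzer on the same French sentence to see how aggressively each
variant stems.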
Cheers, Jörg
In reply to Eric GeLo, Friday, October 26, 2012 4:17:04 PM UTC+2.