ThomasL
(ThomasL)
October 16, 2013, 12:17pm
1
We are not satisfied with the German Snowball stemmer and are looking for
alternatives.
For example, it does not stem some plural variants: Kiwis --> Kiwi,
Autos --> Auto, Nudeln --> Nudel, etc.
We found out about Lucene's GermanLightStemmer, see
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/analysis/de/GermanLightStemmer.html
and think it might be an alternative. At least we hope so…
I tried to use it in my Elasticsearch settings, but without success so far.
Searching for "elasticsearch" and "GermanLightStemmer" turns up very few
results ;-/
Any hints on how to use this stemmer in Elasticsearch would be much
appreciated.
Thanks in advance also for information about other German stemmers
that can be used in Elasticsearch and that handle plural/singular
stemming well.
Cheers,
Thomas
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com .
For more options, visit https://groups.google.com/groups/opt_out .
jprante
(Jörg Prante)
October 16, 2013, 12:41pm
2
The German light stemmer's name is 'light_german'; it is documented at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-stemmer-tokenfilter.html
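For illustration, here is a minimal sketch of index settings that use this stemmer through the generic "stemmer" token filter. It only builds and prints the JSON body you would send when creating an index. The filter name "german_light_stemmer" and the analyzer name "german_light" are made up for this example, and the parameter key is written as "name" to match the wording above; newer documentation may spell it "language", so please check the docs for your version.

```python
import json

# Index settings wiring the built-in "stemmer" token filter to the
# light German stemmer. "german_light_stemmer" and "german_light" are
# hypothetical names chosen for this example.
settings = {
    "settings": {
        "analysis": {
            "filter": {
                "german_light_stemmer": {
                    "type": "stemmer",
                    # Assumption: key is "name"; some versions document "language".
                    "name": "light_german",
                }
            },
            "analyzer": {
                "german_light": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "german_light_stemmer"],
                }
            },
        }
    }
}

# Serialize to the JSON request body for index creation.
body = json.dumps(settings, indent=2)
print(body)
```

A field mapped with the "german_light" analyzer would then be stemmed with the light German stemmer at index and search time.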
Jörg
ThomasL
(ThomasL)
October 18, 2013, 2:49pm
3
:)))
Thanks, trying this soon…
ThomasL
(ThomasL)
October 21, 2013, 4:21pm
4
Thanks again, Jörg.
A quick test shows that the German light stemmer does not handle "Autos/Auto" or
"Nudeln/Nudel" either.
What is the best approach to achieve this?
Would a "stemmer override token filter" be the way to go?
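To make the question concrete, here is a rough sketch of what such settings could look like, with a "stemmer_override" filter placed in front of the light German stemmer. The filter and analyzer names and the three rules are hypothetical and only cover the examples from this thread. It also assumes that the rules can be given inline via "rules"; some versions may only accept a "rules_path" file, so treat this as a sketch rather than a tested configuration.

```python
import json

# Hypothetical override rules for the plurals mentioned in this thread.
# Each rule has the form "inflected form => stem".
override_rules = [
    "kiwis => kiwi",
    "autos => auto",
    "nudeln => nudel",
]

settings = {
    "settings": {
        "analysis": {
            "filter": {
                # Assumption: inline "rules" is supported by your version.
                "german_plural_override": {
                    "type": "stemmer_override",
                    "rules": override_rules,
                },
                "german_light_stemmer": {
                    "type": "stemmer",
                    "name": "light_german",
                },
            },
            "analyzer": {
                "german_with_override": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # The override runs before the stemmer so that the
                    # overridden tokens are not stemmed a second time.
                    "filter": [
                        "lowercase",
                        "german_plural_override",
                        "german_light_stemmer",
                    ],
                }
            },
        }
    }
}

body = json.dumps(settings, indent=2)
print(body)
```

The obvious drawback is that every exception has to be listed by hand, so this only scales to a limited vocabulary.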
Thanks again for all hints!
Thomas
jprante
(Jörg Prante)
October 21, 2013, 9:12pm
5
The best approach is to write a baseform plugin for German that does better
than simple algorithmic stemming.
Fortunately, I have started one and have just released version 1.0.0, which is
based on a lexicon.
Maybe you would like to give it a try.
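To illustrate the general idea of a lexicon-based base form lookup (this is not the plugin's code or API, just a toy sketch with made-up sample data): each token is looked up in a dictionary of known inflected forms, and unknown tokens are passed through unchanged apart from lowercasing.

```python
# Toy lexicon mapping inflected forms to base forms.
# Sample data made up for this example; a real lexicon would be far larger.
LEXICON = {
    "kiwis": "kiwi",
    "autos": "auto",
    "nudeln": "nudel",
}


def baseform(token: str) -> str:
    """Return the lexicon base form, or the lowercased token if unknown."""
    key = token.lower()
    return LEXICON.get(key, key)


for word in ["Kiwis", "Autos", "Nudeln", "Haus"]:
    print(word, "->", baseform(word))
```

Unlike an algorithmic stemmer, such a lookup never guesses: it is only as good as the coverage of the lexicon.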
Jörg