RSLP Stemmer (Brazilian Portuguese) plugin 1.0.0


(Anael Carvalho) #1

This is an implementation of the RSLP stemmer algorithm as described in "A Stemming Algorithm for the Portuguese Language" by Orengo, V.M. and Huyck, C.

https://github.com/anaelcarvalho/elasticsearch-analysis-rslp

A new token filter named 'br_rslp' is made available for use with custom filters and analyzers.

Version 1.0.0 requires Elasticsearch 2.0.0.

Sample configuration:

index:
  analysis:
    filter:
      my_stemmer:
        type: br_rslp
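
The snippet above only declares the filter. As a rough sketch of how it might be wired into a custom analyzer, assuming the built-in `standard` tokenizer and `lowercase` filter (the analyzer name `my_pt_br_analyzer` is hypothetical, and the filter order is an assumption rather than a recommendation from the plugin):

```yaml
index:
  analysis:
    filter:
      my_stemmer:
        type: br_rslp            # token filter provided by this plugin
    analyzer:
      my_pt_br_analyzer:         # hypothetical name, pick your own
        type: custom
        tokenizer: standard      # built-in tokenizer (assumed choice)
        filter: [lowercase, my_stemmer]   # lowercase first, then stem
```

A field mapped with this analyzer would then have its tokens lowercased and stemmed at both index and search time.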

Feedback is very welcome - thanks!


(Diego Bernardes) #2

Nice.

What's the main difference from the current stemmer?
I'm having some problems with the default stemmer: words like 'capinha' get stemmed to 'capinh' instead of 'cap', which gives me false positives. Does this plugin solve this problem?


Just tested the plugin. It works very well and solved my problem; it is much better than the Lucene stemmer.
Thanks!


(Anael Carvalho) #3

Updated to version 2.0.0 for compatibility with ES 5.0.


(Patrick Domenico Antonioli Ferraro) #4

Hi. The plugin is not installing on Elasticsearch 5.4. Cheers.

What is a good setup for Pt-Br search? Analysers, filters, etc...

cheers