Optimize corpus before training

Hi all from the Elastic team.

I am building a pipeline to optimize a corpus before submitting it to OpenNLP training. The steps are:

  1. remove stopwords
  2. apply lemmatizer or stemmer

Does Elastic have any functionality where I pass in a text and it returns the text with stopwords removed and a lemmatizer or stemmer applied?
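
To make the goal concrete, here is a rough Python sketch of the text-in, text-out transformation I am after. It is only a toy under stated assumptions: the stopword list is a tiny made-up sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer or lemmatizer, and the names `optimize` and `naive_stem` are hypothetical, not part of any Elastic or OpenNLP API.

```python
import re

# Tiny illustrative stopword list (an assumption, not a real analyzer's list).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on", "and"}

# Naive suffix stripping stands in for a real stemmer or lemmatizer.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(token: str) -> str:
    """Strip the first matching suffix, leaving short words untouched."""
    for suffix in SUFFIXES:
        # Keep at least three characters so words like "is" are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def optimize(text: str) -> str:
    """Lowercase, tokenize, drop stopwords, stem, and rejoin with spaces."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [naive_stem(t) for t in tokens if t not in STOPWORDS]
    return " ".join(kept)


print(optimize("The dogs are barking in the garden"))  # prints: dog bark garden
```

Ideally the stopword removal and stemming would come from a proper analyzer rather than hand-written rules like these.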

Best regards, and Happy New Year!

