[ANNOUNCEMENT] Elasticsearch analysis lemmagen plugin update

Hi,

Back in 2013 I wrote a plugin that provides the jLemmaGen lemmatizer, with some prebuilt lexicons, as an Elasticsearch token filter. As it turned out, the lexicon license was very restrictive: the plugin was usable only for non-commercial research projects. You can take a look at the original thread, "[ANN] LemmaGen Analysis for ElasticSearch plugin".

Some time ago I found that the source data, the MULTEXT-East free lexicons 4.0, are distributed under the Creative Commons Attribution-ShareAlike 4.0 International license (CC BY-SA 4.0, https://creativecommons.org/licenses/by-sa/4.0/). I believe this means that we can generate lexicons from this source, publish them under the same license, and use them with the plugin.

For this reason I removed the built-in lexicons from the plugin (beginning with plugin v6.0.0) and prepared a separate repository for the lexicons.

  • free lexicons (CC BY-SA 4.0)
    • Bulgarian
    • Czech
    • English
    • Estonian
    • French
    • Hungarian
    • Romanian
    • Slovak
    • Resian (Slovene dialect)
    • Slovene
    • Ukrainian
  • non-free lexicons (CC BY-NC 4.0)
    • Farsi / Persian
    • Macedonian
    • Polish
    • Russian
    • Serbian

I also updated the plugin to work with (almost) all Elasticsearch 5.x and 6.x versions. Starting with plugin version 6.0.0, however, you need to download the particular lexicon you want from the lexicons repository.
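
To illustrate how a downloaded lexicon might then be wired into an index, here is a minimal sketch that only builds the index settings body and prints it as JSON. It assumes the token filter type is `lemmagen` and that it takes a `lexicon` parameter naming the downloaded lexicon file (for example `cs`); please verify both against the plugin README. The helper name `lemmagen_index_settings`, the filter and analyzer names, and the choice of the `standard` tokenizer are hypothetical and used only for illustration.

```python
import json


def lemmagen_index_settings(lexicon: str) -> dict:
    """Build an index-creation body that uses a lemmagen token filter.

    Assumption: the plugin registers a token filter of type "lemmagen"
    with a "lexicon" parameter; check the plugin README before relying on it.
    """
    # Hypothetical names, chosen only for this example.
    filter_name = f"lemmagen_filter_{lexicon}"
    analyzer_name = f"lemmagen_{lexicon}"
    return {
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        filter_name: {"type": "lemmagen", "lexicon": lexicon},
                    },
                    "analyzer": {
                        analyzer_name: {
                            "type": "custom",
                            "tokenizer": "standard",
                            "filter": [filter_name],
                        },
                    },
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body; it could be sent with a PUT request when creating an index.
    print(json.dumps(lemmagen_index_settings("cs"), indent=2))
```

The sketch makes no network calls; it only produces the request body, so the lexicon code can be swapped for any of the lexicons listed above.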

I believe this breaking change will allow us to use this Elasticsearch plugin even in commercial projects (with the free lexicons).

More information can be found at elasticsearch-analysis-lemmagen and lemmagen-lexicons repositories.

Regards,
Vojta

A version for the new Elasticsearch 6.3.0 has been released: https://github.com/vhyza/elasticsearch-analysis-lemmagen/releases/tag/v6.3.0

A version for Elasticsearch 7.7.0 has been released as the 100th release of the plugin :partying_face: - https://github.com/vhyza/elasticsearch-analysis-lemmagen/releases/tag/v7.7.0

Elasticsearch 6.8.9 also got its own release - https://github.com/vhyza/elasticsearch-analysis-lemmagen/releases/tag/v6.8.9
