[Ann] Elasticsearch Langdetect plugin

Hi,

Here is a small plugin for the upcoming Elasticsearch 0.20 release. It is a
language detector plugin, a revamped version of the implementation of
Nakatani Shuyo's language detector at
http://code.google.com/p/language-detection/

The plugin is available at
https://github.com/jprante/elasticsearch-langdetect

It offers a REST endpoint to which a short UTF-8 text can be posted, and
Elasticsearch responds with a list of recognized languages.
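
For illustration, posting a text to the endpoint could look roughly like the Python sketch below. The endpoint path "/_langdetect", the host and port, and the Content-Type header are assumptions made for this example and are not stated in this announcement; please check the plugin README for the actual values.

```python
import urllib.request

# Assumption: endpoint path, host and port are placeholders for this sketch.
LANGDETECT_URL = "http://localhost:9200/_langdetect"


def build_langdetect_request(text, url=LANGDETECT_URL):
    """Build (but do not send) a POST request carrying `text` as UTF-8 bytes."""
    body = text.encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: a plain-text content type; the plugin may not require it.
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


def detect(text, url=LANGDETECT_URL):
    """Send the request to a running node and return the raw response body."""
    with urllib.request.urlopen(build_langdetect_request(text, url)) as resp:
        return resp.read().decode("utf-8")
```

Calling detect("Das ist ein Test") needs a running node with the plugin installed; build_langdetect_request() alone can be inspected without any server.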

Currently, it does not provide automatic language-aware indexing. It is
just for your convenience. You have to evaluate the response yourself
and take the appropriate language-dependent action.
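
As a rough idea of what "evaluating the response yourself" might mean, here is a minimal Python sketch that picks an analyzer name from a detection result. The response shape (a "languages" list with "language" and "probability" fields), the sample values, the threshold, and the language-to-analyzer mapping are all assumptions for this example, not the documented output of the plugin.

```python
import json

# Assumption: hypothetical response body; the real field names may differ.
SAMPLE_RESPONSE = (
    '{"languages": ['
    '{"language": "de", "probability": 0.83}, '
    '{"language": "nl", "probability": 0.17}]}'
)

# Hypothetical mapping from detected language codes to analyzer names.
ANALYZERS = {"de": "german", "en": "english", "fr": "french"}


def choose_analyzer(response_text, min_probability=0.5, default="standard"):
    """Return an analyzer name for the most probable detected language.

    Falls back to `default` when nothing was detected, when the best
    candidate is below `min_probability`, or when the language is unmapped.
    """
    languages = json.loads(response_text).get("languages", [])
    if not languages:
        return default
    best = max(languages, key=lambda entry: entry["probability"])
    if best["probability"] < min_probability:
        return default
    return ANALYZERS.get(best["language"], default)
```

The fallback to a default analyzer is a design choice of this sketch: on short texts the detection may be uncertain, so a neutral choice is safer than a wrong language-specific one.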

Note: it does not work on Elasticsearch 0.19 because of API changes in the
REST action implementation.

Cheers and happy language detecting,

Jörg

--

I really love that one!

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--

Thanks Jörg!

You are a great producer of Elasticsearch plugins :slight_smile:

-- Tanguy
Twitter: @tlrx

--

Jörg should think of opening a JorgPranteAppStore (with an ES instance to search for plugins)

:wink:

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
