[Just Pushed]: plugins are here, and the attachments plugin is the first one

elasticsearch just got plugins support, with the first plugin already
implemented. The attachments plugin adds a mapper type of attachment, which
allows parsing different formats and indexing them (html, pdf, docs, ...). The
issue and the explanation are here:
http://github.com/elasticsearch/elasticsearch/issues/issue/92.
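
To make the mapper type above a bit more concrete, here is a rough Java sketch of
what a mapping and a document using it might look like. Only the type name
attachment comes from this announcement. The field name "file", the exact shape
of the mapping JSON, and the idea that the file content is sent as a base64
string are assumptions for illustration; check the linked issue for the actual
format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class AttachmentDocSketch {

    /** Builds a mapping that declares one field of type "attachment" (assumed shape). */
    static String attachmentMapping(String fieldName) {
        return "{\"properties\": {\"" + fieldName + "\": {\"type\": \"attachment\"}}}";
    }

    /**
     * Builds a JSON document carrying the raw file bytes as base64 (assumption).
     * The field name is expected to be a simple identifier; it is not escaped here.
     */
    static String attachmentDocument(String fieldName, byte[] fileBytes) {
        String encoded = Base64.getEncoder().encodeToString(fileBytes);
        return "{\"" + fieldName + "\": \"" + encoded + "\"}";
    }

    public static void main(String[] args) {
        byte[] html = "<html><body>hello</body></html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(attachmentMapping("file"));
        System.out.println(attachmentDocument("file", html));
    }
}
```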

Why is the attachments support delivered as a plugin? The main reason is
that I want to keep the elasticsearch installation as small as possible,
with mainly core features. The attachments plugin weighs in at 13mb.
Also, I hope that more plugins will start to be developed (elasticsearch is
highly modular and pluggable) though the internal SPI is still going through
changes.

Since I do consider the attachments plugin semi core, it is within the
elasticsearch/elasticsearch repo. Other plugins might be developed as
completely separate projects to (hopefully) encourage participation.

My head bursts from all the ideas for plugins that I have. I would love to
hear what you think as well. For example, two of the main plugins I wish to
develop are the cloud plugin (better integration with different cloud
providers), and different "nosql" plugins (automatically index couchdb,
cassandra, voldemort, ...).

-shay.banon

On Mon, Mar 29, 2010 at 12:08 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

elasticsearch just got plugins support, with the first plugin already
implemented.

Congratulations, very cool!
Are there any (java)docs about plugin APIs?
How will such an API be delivered? Will it be part of the main
elasticsearch distribution, or rather a separate jar?

My head bursts from all the ideas for plugins that I have. I would love to
hear what you think as well. For example, two of the main plugins I wish to
develop are the cloud plugin (better integration with different cloud
providers), and different "nosql" plugins (automatically index couchdb,
cassandra, voldemort, ...).

Regarding nosql related plugins, there are imho two kind of integrations:

  1. The one you mentioned, for automatically indexing nosql documents.
  2. Using a nosql store as an elasticsearch gateway for long term persistence.

WDYT?

--
Sergio Bossa
http://www.linkedin.com/in/sergiob

On Mon, Mar 29, 2010 at 1:27 PM, Sergio Bossa sergio.bossa@gmail.com wrote:

On Mon, Mar 29, 2010 at 12:08 PM, Shay Banon
shay.banon@elasticsearch.com wrote:

elasticsearch just got plugins support, with the first plugin already
implemented.

Congratulations, very cool!

Thanks!

Are there any (java)docs about plugin APIs?

Very minor :). A plugin basically revolves around injecting your code.
Objects are created using Guice (you supply Module(s)) and managed as
lifecycle services. In elasticsearch, there is a hierarchy of Node Services ->
(Index Services) -> (Index Shard Services). For example, discovery is a node
service, analysis is an index module, and an actual lucene index store
is an index shard service. With a plugin you can plug into any of these
levels.
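
As a rough illustration of the three levels described above, here is a minimal
Java sketch of nested scopes whose services are started top-down and stopped in
reverse. It deliberately uses plain Java instead of Guice, and every name in it
(LifecycleService, Scope, plug, child) is hypothetical; it is not the actual
elasticsearch SPI, which is still changing.

```java
import java.util.ArrayList;
import java.util.List;

final class LifecycleSketch {

    /** Hypothetical lifecycle contract, not the real elasticsearch interface. */
    interface LifecycleService {
        void start();
        void stop();
    }

    /** A node, index, or shard scope holding plugged-in services and child scopes. */
    static final class Scope implements LifecycleService {
        private final String name;
        private final List<String> log;
        private final List<String> services = new ArrayList<>();
        private final List<Scope> children = new ArrayList<>();

        Scope(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        /** Registers a service at this level, as a plugin might. */
        Scope plug(String service) {
            services.add(service);
            return this;
        }

        Scope child(Scope scope) {
            children.add(scope);
            return this;
        }

        @Override
        public void start() {
            // Start this level's services first, then descend into child scopes.
            for (String s : services) {
                log.add("start " + name + "/" + s);
            }
            for (Scope c : children) {
                c.start();
            }
        }

        @Override
        public void stop() {
            // Stop in reverse: children first, then this level's services.
            for (int i = children.size() - 1; i >= 0; i--) {
                children.get(i).stop();
            }
            for (int i = services.size() - 1; i >= 0; i--) {
                log.add("stop " + name + "/" + services.get(i));
            }
        }
    }

    /** Builds node -> index -> shard with one service each and runs a full cycle. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Scope shard = new Scope("shard", log).plug("store");
        Scope index = new Scope("index", log).plug("analysis").child(shard);
        Scope node = new Scope("node", log).plug("discovery").child(index);
        node.start();
        node.stop();
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In a Guice-based setup the scopes would be populated through the modules a
plugin supplies rather than through explicit plug(...) calls.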

How will such an API be delivered? Will it be part of the main
elasticsearch distribution, or rather a separate jar?

The idea is that it's a simple zip file containing all the jar files of your
plugin. You place it under the plugins directory and it is automatically added
to the classpath. When using elasticsearch embedded in your java code,
you can do the same (and control the plugin directory location) or
simply add the plugin jar files to the classpath; both will work.
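
For the embedded case, the "put every plugin jar on the classpath" step could
look roughly like the Java sketch below. It assumes the plugin zip has already
been unpacked so that the jar files sit somewhere under a plugins directory; the
class and method names are made up for illustration and this is not how
elasticsearch itself implements the loading.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PluginClasspathSketch {

    /** Collects the URL of every *.jar file found anywhere under pluginsDir. */
    static List<URL> findJars(Path pluginsDir) throws IOException {
        List<URL> urls = new ArrayList<>();
        if (!Files.isDirectory(pluginsDir)) {
            return urls; // no plugins directory simply means no plugins
        }
        List<Path> jars;
        try (Stream<Path> walk = Files.walk(pluginsDir)) {
            jars = walk.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .sorted() // stable order keeps the classpath deterministic
                    .collect(Collectors.toList());
        }
        for (Path jar : jars) {
            urls.add(jar.toUri().toURL());
        }
        return urls;
    }

    /** Creates a class loader that sees the plugin jars in addition to the parent. */
    static URLClassLoader pluginClassLoader(Path pluginsDir, ClassLoader parent)
            throws IOException {
        return new URLClassLoader(findJars(pluginsDir).toArray(new URL[0]), parent);
    }

    public static void main(String[] args) throws IOException {
        // The directory location is a parameter, mirroring the configurable location.
        Path dir = Paths.get(args.length > 0 ? args[0] : "plugins");
        ClassLoader parent = PluginClasspathSketch.class.getClassLoader();
        try (URLClassLoader loader = pluginClassLoader(dir, parent)) {
            for (URL url : loader.getURLs()) {
                System.out.println(url);
            }
        }
    }
}
```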

My head bursts from all the ideas for plugins that I have. I would love to
hear what you think as well. For example, two of the main plugins I wish to
develop are the cloud plugin (better integration with different cloud
providers), and different "nosql" plugins (automatically index couchdb,
cassandra, voldemort, ...).

Regarding nosql related plugins, there are imho two kind of integrations:

  1. The one you mentioned, for automatically indexing nosql documents.
  2. Using a nosql store as an elasticsearch gateway for long term
    persistence.

Agreed! The gateway storage can be done on top of cassandra, for example, or
HDFS, S3, CloudFiles and so on.
