elasticsearch just got plugin support, with the first plugin already
implemented. The attachments plugin adds a mapper type of attachment, which
allows documents in different formats (html, pdf, docs, ...) to be parsed and
indexed. The issue and the explanation are here:
http://github.com/elasticsearch/elasticsearch/issues/issue/92.
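To give a rough idea of what using the attachment mapper type might look like, here is a minimal sketch that only builds the JSON bodies and sends nothing to a node. It assumes that a field is declared with the type "attachment" and that the file content is passed as a base64 encoded string; the linked issue is the reference for the actual format. The type name "person", the field name "file", and both helper functions are made up for illustration.

```python
import base64
import json


def build_attachment_mapping(type_name, field_name):
    """Build a mapping that declares field_name with the attachment mapper type.

    Assumption: the plugin is selected by setting "type" to "attachment".
    """
    return {type_name: {"properties": {field_name: {"type": "attachment"}}}}


def build_attachment_document(field_name, raw_bytes):
    """Build a document body carrying the raw file bytes as base64 text.

    Assumption: the attachment field accepts base64 encoded content.
    """
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {field_name: encoded}


if __name__ == "__main__":
    # Hypothetical type and field names, used only for this example.
    mapping = build_attachment_mapping("person", "file")
    document = build_attachment_document("file", b"<html><body>hello</body></html>")
    print(json.dumps(mapping, indent=2))
    print(json.dumps(document, indent=2))
```

The two printed JSON bodies would then be sent to the put mapping and index APIs respectively, using whatever HTTP client you prefer.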
Why is the attachments support delivered as a plugin? The main reason is
that I want to keep the elasticsearch installation as small as possible,
with mainly core features. The attachments plugin weighs in at 13mb.
Also, I hope that more plugins will start to be developed (elasticsearch is
highly modular and pluggable), though the internal SPI is still going through
changes.
Since I do consider the attachments plugin semi-core, it lives within the
elasticsearch/elasticsearch repo. Other plugins might be developed as
completely separate projects to (hopefully) encourage participation.
My head is bursting with all the ideas for plugins that I have. I would love
to hear what you think as well. For example, the two main plugins I wish to
develop are the cloud plugin (better integration with different cloud
providers), and different "nosql" plugins (automatically index couchdb,
cassandra, voldemort, ...).
-shay.banon