Hi,
Both the indexer support and an initial indexer implementation for the
twitter sample stream are now in master. Indexer information can be found
here: http://github.com/elasticsearch/elasticsearch/issues/closed#issue/377.
The twitter indexer plugin is here:
http://github.com/elasticsearch/elasticsearch/issues/closed#issue/378.
I encourage you to head over and read them. I had a great Eureka moment
where I decided to store the indexer's meta and state information as another
index in elasticsearch (did someone say eating your own dog food?), which
really means the indexer is very open and has a good state persistence API
(the elasticsearch API).
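To make the idea concrete, here is a rough Python sketch (not the actual
implementation, and not the elasticsearch API). InMemoryIndex is a
hypothetical stand-in for an elasticsearch index, and SketchIndexer and the
"processed" state field are made-up names for illustration only. The point it
tries to show: if an indexer writes its state as a document into an index,
a restarted indexer can pick that state up again by reading the document back.

```python
import json


class InMemoryIndex:
    """Hypothetical stand-in for an elasticsearch index: JSON docs keyed by id."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, doc):
        # Store the document as a JSON string, mimicking a JSON document store.
        self._docs[doc_id] = json.dumps(doc)

    def get(self, doc_id):
        raw = self._docs.get(doc_id)
        return json.loads(raw) if raw is not None else None


class SketchIndexer:
    """Hypothetical indexer that keeps its state as a document in an index."""

    def __init__(self, name, state_index):
        self.name = name
        self.state_index = state_index
        # On startup, resume from previously saved state if there is any.
        saved = state_index.get(name)
        self.state = saved if saved is not None else {"processed": 0}

    def handle(self, item):
        # Real work on `item` would happen here; the sketch only counts it.
        self.state["processed"] += 1
        # Persist the updated state as a document keyed by the indexer name.
        self.state_index.index(self.name, self.state)


state_index = InMemoryIndex()
first = SketchIndexer("twitter", state_index)
for tweet in ["a", "b", "c"]:
    first.handle(tweet)

# A fresh instance with the same name resumes from the stored state.
restarted = SketchIndexer("twitter", state_index)
print(restarted.state["processed"])
```

Since the state is just another document, anything that can talk to the index
can also inspect or change it, which is what I mean by the indexer being open.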
The indexer itself is pretty open; here is the twitter indexer (pretty
simple):
http://github.com/elasticsearch/elasticsearch/blob/master/plugins/indexer/twitter/src/main/java/org/elasticsearch/indexer/twitter/TwitterIndexer.java
Some of my plans are to provide for polyglot indexers (write your own
indexer in groovy, ruby), but since that will probably require the
elasticsearch Client API, I would love to first get proper support for the
JVM languages (as is already the case with groovy). Also, I have some ideas
for more indexer implementations, like wikipedia, rabbitmq, JMS, redis,
couchdb, and others.
As a side note, the sample twitter stream API is pretty slow (in
elasticsearch terms), so you can easily run it on your laptop without it
breaking a sweat at all ;). I would have loved to test it with the firehose...
-shay.banon