ElasticSearch + Cassandra

Hi Everyone,

Has anyone done/attempted to integrate ElasticSearch with Cassandra
like was described at http://www.elasticsearch.org/blog/2010/02/25/nosql_yessearch.html
?

I don't mean using Cassandra as a Gateway for ES, but using ES to
query the data in Cassandra automatically.

Thanks,

Drew

Not that I am aware of.
On Monday, April 4, 2011 at 8:12 PM, Drew wrote:

Hi Everyone,

Has anyone done/attempted to integrate Elasticsearch with Cassandra
like was described at http://www.elasticsearch.org/blog/2010/02/25/nosql_yessearch.html
?

I don't mean using Cassandra as a Gateway for ES, but using ES to
query the data in Cassandra automatically.

Thanks,

Drew

It's not based on Elasticsearch, but tjake has the Solandra integration:

https://github.com/tjake/Solandra

On 5 April 2011 20:44, Shay Banon shay.banon@elasticsearch.com wrote:

Not that I am aware... .


That's not what was asked for. The question was about something that automatically indexes changes made to Cassandra. Solandra tries to build Solr on top of Cassandra to make it better distributed, but it then combines two problems instead of one: using Solr (distributed execution), and trying to hack Lucene to work on top of Cassandra, which is problematic and probably a bad idea in most cases.
On Wednesday, April 6, 2011 at 6:41 AM, Paul Smith wrote:

It's not based on Elasticsearch, but tjake has the Solandra integration:

GitHub - tjake/Solandra: Solandra = Solr + Cassandra


Just as a side note.

Maybe it is a bit unfair to compare Solandra with Elasticsearch.
Solandra looks pretty cool and interesting.

Three months ago I did a minor evaluation for my project, and Solandra was
in a useful state.
But as it turned out, the performance and memory usage of ES (or pure Solr)
were far better on a single machine than Solandra's.
Solandra also had some bugs (JVM crashes, facets didn't work), which
I think have since been fixed.

Also, ES is a lot cleaner, both in terms of the technologies involved and in
terms of written unit tests. If you look into the test folder of
Solandra, it is pretty much empty.

Then again, Solandra is one year old, whereas Elasticsearch was built by an
author with the concepts of the well-known, several-years-old project
'Compass' in mind :)

Regards,
Peter.

On 6 Apr., 11:08, Shay Banon shay.ba...@elasticsearch.com wrote:

Its not what was asked for. The question was for something to automatically index changes done to cassandra. Solandra tries to build solr on top of cassandra to make it better distributed, but it combines then two problems instead of one: using solr (distributed execution) and trying to hack Lucene to work on top of cassandra which is problematic and probably bad in most cases.


Yes, agreed, Solandra and Lucandra are pretty cool in terms of concepts, but I think they fall short in how they use Lucene (putting the Solr aspect aside).

The problem (and this is based on my last check of how the Lucene-on-top-of-Cassandra implementation was done) is how the reader/searcher works on top of Cassandra. First, it bypasses a lot of the optimizations Lucene has in how it stores/fetches data in its reader/searcher implementation, and uses Cassandra to load that info. More serious is how Lucene uses things like FieldCache and other reader-based caches, which does not work well with the Lucandra-based reader/searcher (sorting and facets, for example). This can cause very severe performance/memory problems.

Again, this is based on my overview of the code. If I am missing something, I would love to be corrected. Not here to spread FUD or anything.

-shay.banon
On Wednesday, April 6, 2011 at 3:14 PM, Karussell wrote:

Just as a side note.

Maybe it is a bit unfair to compare Solandra with Elasticsearch.
Solandra looks pretty cool and interesting.

3 months ago I did a minor evaluation for my project and Solandra was
in a useful state.
BUT as it turns out performance and memory usage of ES (or pure solr)
was far better on a single machine than Solandra.
Also Solandra had some bugs (jvm crashed, facets didn't work) which
then were fixed I think.

Also ES is a lot cleaner. In terms of technologies involved and in
terms of written unit tests. If you look into the test folder of
solandra it is pretty much empty.

Then Solandra is 1 year old, where Elasticsearch is built from an
author with concepts of the well known, several years old project
'Compass' in mind :slight_smile:

Regards,
Peter.


On 6 Apr., 14:21, Shay Banon shay.ba...@elasticsearch.com wrote:

Yes, agreed, Solandra and Lucandra are pretty cool in terms of concepts, but I think they lack when it comes to how they use Lucene (putting the solr aspect aside).

The problem (and its based on my last check in how the Lucene on top of Cassandra implementation was done)

@tjake rewrote the integration from scratch for the solandra version,
so not sure if this still applies.

Not here to spread FUD or anything.

Yes, of course. Maybe we should invite him to have a real
conversation :)

On Wednesday, April 6, 2011 at 11:40 PM, Karussell wrote:

On 6 Apr., 14:21, Shay Banon shay.ba...@elasticsearch.com wrote:

Yes, agreed, Solandra and Lucandra are pretty cool in terms of concepts, but I think they lack when it comes to how they use Lucene (putting the solr aspect aside).

The problem (and its based on my last check in how the Lucene on top of Cassandra implementation was done)

@tjake rewrote the integration from scratch for the solandra version,
so not sure if this still applies.

As far as I can see, it still suffers from the reader problem mentioned: https://github.com/tjake/Solandra/blob/solandra/src/lucandra/IndexReader.java. The main problem is that it loses the whole immutable (up to deletes), segment-reader-based concept in Lucene (on top of other low-level optimizations in how Lucene readers work).

Not here to spread FUD or anything.

yes, of course. Maybe we should invite him to get a real
conversation :slight_smile:

On Apr 6, 7:21 am, Shay Banon shay.ba...@elasticsearch.com wrote:

Yes, agreed, Solandra and Lucandra are pretty cool in terms of concepts, but I think they lack when it comes to how they use Lucene (putting the solr aspect aside).

The problem (and its based on my last check in how the Lucene on top of Cassandra implementation was done) is how the reader/searcher works on top of Cassandra. First, it bypass a lot of optimizations lucene has in terms of how it stores/fetches data in its reader/searcher implementation, and uses Cassandra to load that info.

You mean in terms of file format, correct? It's true that the Lucene format is
much better here, but Cassandra will get there. In the meantime, there is the
benefit of the data being truly masterless and scalable (compared to a
tech like Solr).

The more problematic nature is how lucene uses things like FieldCache and caches based on readers, which is problematic when using the reader/searcher based on Lucandra (sorting and facets for example). This can cause very severe performance/memory problems.

Solandra breaks the index into manageable shards, so this is less of a
problem than it was in Lucandra. The long-term solution
comes when Cassandra trigger support arrives; then it will be able to
update the FieldCaches without invalidating them (since Solandra
allocates a fixed space of documents that can be filled in over
time). I haven't taken a deep look at how ES handles this problem, but
I'd love a brief description if you have a sec, Shay.

Again, this is based on my overview of the code. If I am missing something, I would love to be corrected. Not here to spread FUD or anything.

I don't see any FUD here. It's simply that Solandra is taking a more
fundamental approach to handling distributed search by dealing with it
at the file format level... Elasticsearch has built distributed search
on top of Lucene and added a number of great features and a service
layer.

Solandra's strongest use case is for those who need potentially
millions of smaller indexes, since it's not creating a directory per
index under the hood but rather using composite keys in Cassandra.
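
To make the composite-key idea concrete, here is a minimal sketch in Python. It is not Solandra's actual schema or code: the `CompositeKeyStore` class, its method names, and the `(index_name, term)` key layout are assumptions made up for illustration. It only shows why many small logical indexes can share one sorted keyspace instead of each needing its own directory.

```python
from bisect import bisect_left, insort


class CompositeKeyStore:
    """Hypothetical stand-in: one sorted keyspace shared by many small indexes."""

    def __init__(self):
        self._keys = []      # sorted list of (index_name, term) tuples
        self._postings = {}  # (index_name, term) -> set of doc ids

    def add(self, index_name, term, doc_id):
        key = (index_name, term)
        if key not in self._postings:
            insort(self._keys, key)  # keep the keyspace ordered
            self._postings[key] = set()
        self._postings[key].add(doc_id)

    def lookup(self, index_name, term):
        """Point lookup of one term inside one logical index."""
        return sorted(self._postings.get((index_name, term), ()))

    def terms(self, index_name):
        """Range scan over every term that belongs to one logical index."""
        start = bisect_left(self._keys, (index_name,))
        found = []
        for name, term in self._keys[start:]:
            if name != index_name:
                break  # left this index's key range
            found.append(term)
        return found


store = CompositeKeyStore()
store.add("user42", "cassandra", 1)
store.add("user42", "lucene", 2)
store.add("user7", "lucene", 9)
print(store.lookup("user42", "lucene"))
print(store.terms("user42"))
```

Adding another logical index here costs only a new key prefix, which is the property described above; the per-index overhead of a real Lucene directory (file handles and so on) is not modeled.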

-Jake


Heya, great you jumped in :), answers below:
On Thursday, April 7, 2011 at 12:02 AM, tjake wrote:

On Apr 6, 7:21 am, Shay Banon shay.ba...@elasticsearch.com wrote:

Yes, agreed, Solandra and Lucandra are pretty cool in terms of concepts, but I think they lack when it comes to how they use Lucene (putting the solr aspect aside).

The problem (and its based on my last check in how the Lucene on top of Cassandra implementation was done) is how the reader/searcher works on top of Cassandra. First, it bypass a lot of optimizations lucene has in terms of how it stores/fetches data in its reader/searcher implementation, and uses Cassandra to load that info.

You mean in terms of file format, correct? It's true that the Lucene format is
much better here, but Cassandra will get there.

Yes, the intimacy with how Lucene works with the data structures it generates and then searches on is a great boon for performance. Cassandra is blazing fast, but it's hard to believe it can catch up to a system that creates the data structures it knows it is going to consume.

In the meantime, there is the
benefit of the data being truly masterless and scalable (compared to a
tech like Solr).

Agreed, that's a real benefit. The main problem here, though, is that Solandra still needs to work with Solr's distributed search support :)

The more problematic nature is how lucene uses things like FieldCache and caches based on readers, which is problematic when using the reader/searcher based on Lucandra (sorting and facets for example). This can cause very severe performance/memory problems.

Solandra breaks the index into manageable shards, so this is less of a
problem than it was in Lucandra.

Right, but how big can a single shard be? The more shards you have, the more problems you run into with how Solr executes distributed search (blocking IO, HTTP, and all the other bits). And with big shards, having to reload all the caches whenever you want to see changes is very costly.

The long-term solution
comes when Cassandra trigger support arrives; then it will be able to
update the FieldCaches without invalidating them (since Solandra
allocates a fixed space of documents that can be filled in over
time).

FieldCache is one (very good) example. Others include filter caching, or any reader-based cache constructs. Those are used heavily in advanced search systems and become harder to solve with a trigger-based approach.

I haven't taken a deep look at how ES handles this problem, but
I'd love a brief description if you have a sec, Shay.

Sure, no secret sauce here. It basically uses the same logic as the Lucene FieldCache, using the segment cache key to cache the data (like the field-level cache). Those are immutable (not evicted because of deletes, mind you).
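
The point about caching per immutable segment can be sketched in a few lines of Python. This is not Lucene's or Elasticsearch's code: `Segment`, `SegmentFieldCache`, and the `loads` counter are hypothetical names invented for illustration. The sketch only shows that when the cache is keyed by segment, new data means loading one new segment rather than rebuilding everything.

```python
class Segment:
    """Hypothetical immutable batch of documents with one numeric field."""

    def __init__(self, segment_id, field_values):
        self.segment_id = segment_id
        self.field_values = tuple(field_values)  # never changes once written


class SegmentFieldCache:
    """Caches field values per segment id, so old segments are never reloaded."""

    def __init__(self):
        self._cache = {}
        self.loads = 0  # how many times values had to be built from a segment

    def values(self, segment):
        key = segment.segment_id
        if key not in self._cache:
            self.loads += 1
            self._cache[key] = list(segment.field_values)
        return self._cache[key]


def sorted_values(segments, cache):
    """Sort across all segments using the cached per-segment values."""
    merged = []
    for segment in segments:
        merged.extend(cache.values(segment))
    return sorted(merged)


cache = SegmentFieldCache()
segments = [Segment("s0", [3, 1]), Segment("s1", [2])]
sorted_values(segments, cache)          # loads s0 and s1
segments.append(Segment("s2", [0]))     # new documents arrive as a new segment
sorted_values(segments, cache)          # only s2 is loaded this time
print(cache.loads)
```

A cache keyed by the whole reader would instead be rebuilt in full on every change, which is the cost being discussed for big shards.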

Again, this is based on my overview of the code. If I am missing something, I would love to be corrected. Not here to spread FUD or anything.

I don't see any FUD here. It's simply that Solandra is taking a more
fundamental approach to handling distributed search by dealing with it
at the file format level... Elasticsearch has built distributed search
on top of Lucene and added a number of great features and a service
layer.

It's certainly a cool way to try and solve it. I have been there, trying to do the same, starting from custom Directory implementations and moving to custom readers mainly optimized to use data grids (with collocation). Elasticsearch, at least for me, is the next step that I took from there: mainly letting Lucene do what it does best, and trying to wrap it in the best way possible :).

Term-based vs. document-based partitioning has been a great question to ask when trying to build distributed search engines. I think it has basically been proven, at least in the practical sense, that document-based partitioning is the way to go (watch this on Google doing it: Challenges in Building Large-Scale Information Retrieval Systems - VideoLectures.NET, at the 9:44 mark).

This is, by the way, why Riak Search, though another really cool technology, is very, very problematic. They do term-based partitioning, and it's a slippery slope from there (think of implementing something like facets, which rely on what is called by-doc data, in a collocated manner with term-based partitioning).

Though Solandra does not strictly use term-based partitioning, it still replicates based on terms (I think?).
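
As an illustration of why document-based partitioning keeps facets collocated, here is a small Python sketch. It is a toy model, not code from Elasticsearch, Solandra, or Riak Search; the document shape and function names (`build_shards`, `local_facets`, `facet_search`) are assumptions. Each shard holds whole documents, so it can match the query and count facet values locally, and only the small per-shard counts need to be merged.

```python
from collections import Counter


def shard_for(doc_id, num_shards):
    """Assign a whole document to exactly one shard."""
    return doc_id % num_shards


def build_shards(docs, num_shards):
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[shard_for(doc["id"], num_shards)].append(doc)
    return shards


def local_facets(shard, query_term, facet_field):
    """Runs entirely on one shard: match and count without remote lookups."""
    counts = Counter()
    for doc in shard:
        if query_term in doc["text"].split():
            counts[doc[facet_field]] += 1
    return counts


def facet_search(shards, query_term, facet_field):
    """Coordinator step: merge the per-shard facet counts."""
    total = Counter()
    for shard in shards:
        total.update(local_facets(shard, query_term, facet_field))
    return dict(total)


docs = [
    {"id": 1, "text": "cassandra search", "lang": "java"},
    {"id": 2, "text": "lucene search", "lang": "java"},
    {"id": 3, "text": "riak search", "lang": "erlang"},
    {"id": 4, "text": "cassandra storage", "lang": "java"},
]
print(facet_search(build_shards(docs, 2), "search", "lang"))
```

With term-based partitioning, the node that owns the postings for "search" would not necessarily hold the `lang` value of each matching document, so the facet step would need extra cross-node lookups.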

Solandra's strongest use case is for those who need potentially
millions of smaller indexes, since it's not creating a directory per
index under the hood but rather using composite keys in Cassandra.

Agreed, that's a boon. In Elasticsearch there is no reason why you could not create millions of indices, but the overhead of each shard, Lucene-wise, is substantial (starting with file handles...).

For those cases at least, Elasticsearch lets you control routing when indexing and searching. That allows you to create an index with N shards, index all those docs with a qualifier (that also controls routing), and then filter out only the ones you want.
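
The routing idea can be sketched as follows. This is a toy Python model, not the Elasticsearch API: `RoutedIndex`, `shard_for`, the `qualifier` field name, and the use of CRC32 as the hash are all assumptions chosen for illustration. The qualifier picks the shard at index time, and at search time the same qualifier both selects that single shard and filters out other tenants that happen to share it.

```python
import zlib

NUM_SHARDS = 4


def shard_for(routing_key, num_shards=NUM_SHARDS):
    # crc32 is stable across runs, unlike Python's built-in hash() for strings
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


class RoutedIndex:
    """One index with N shards that serves many small logical indexes."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [[] for _ in range(num_shards)]

    def index(self, doc, qualifier):
        stored = dict(doc, qualifier=qualifier)
        self.shards[shard_for(qualifier, len(self.shards))].append(stored)

    def search(self, term, qualifier):
        # Only the shard chosen by the routing key is consulted.
        shard = self.shards[shard_for(qualifier, len(self.shards))]
        return [
            doc for doc in shard
            if doc["qualifier"] == qualifier and term in doc["text"].split()
        ]


idx = RoutedIndex()
idx.index({"id": 1, "text": "hello world"}, "user-a")
idx.index({"id": 2, "text": "hello there"}, "user-b")
idx.index({"id": 3, "text": "goodbye world"}, "user-a")
print([doc["id"] for doc in idx.search("world", "user-a")])
```

The filter on `qualifier` matters because several qualifiers can hash to the same shard; routing narrows the search to one shard, and the filter narrows it to one logical index.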


Thanks, updated below...

On Wed, Apr 6, 2011 at 9:43 PM, Shay Banon shay.banon@elasticsearch.com wrote:

Heya, great you jumped in :), answers below:

On Thursday, April 7, 2011 at 12:02 AM, tjake wrote:

On Apr 6, 7:21 am, Shay Banon shay.ba...@elasticsearch.com wrote:

Yes, agreed, Solandra and Lucandra are pretty cool in terms of concepts,
but I think they lack when it comes to how they use Lucene (putting the solr
aspect aside).

The problem (and its based on my last check in how the Lucene on top of
Cassandra implementation was done) is how the reader/searcher works on top
of Cassandra. First, it bypass a lot of optimizations lucene has in terms of
how it stores/fetches data in its reader/searcher implementation, and uses
Cassandra to load that info.

You mean in terms of file format correct? It's true Lucene format is
much better here but Cassandra will get to this.

Yes, the intimacy with how Lucene works with the data structures it
generates and then searches on is a great boon for performance. Cassandra is
blazing fast, but its hard to believe it can catch up to a system that
creates the data structures it knows it is going to consume.

I guess the limitation would be if you couldn't represent the data in a
column-family style. I don't see anything in the Lucene file format that isn't
equally representable. With Solandra on a single node, search performance is
basically the same as Solr's, with facets and sorting.

In the meantime the
benefit of the data being truly masterless and scalable (compared to a
tech like solr).

Agreed, thats a real benefit. The main problem here though is that Solandra
still needs to work with solr distributed search support :slight_smile:

It's not that bad; the binary format is OK. The only problem is that when you
want to jump to result 100k, it resends all of those results (not sure if 3.1
has addressed this...). Also, there is no reason why Solandra has to use
only Solr. I could picture an ES service layer on Cassandra as well :) It
seems much of what you've built is similar to Cassandra in many ways, with
write consistency, live scaling, and replication. But my goal right now is to
provide true distributed search to all the Solr users out there...
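
The deep-paging cost mentioned above can be shown with a generic scatter-gather sketch in Python. This is not Solr's wire protocol or code; `shard_top`, `paged_search`, and the `(score, doc_id)` hit shape are assumptions for illustration. To return the page starting at `offset`, the coordinator cannot know which shard holds those hits, so every shard has to send its top `offset + size` results.

```python
import heapq


def shard_top(hits, n):
    """Top-n (score, doc_id) pairs of one shard, best first."""
    return heapq.nlargest(n, hits)


def paged_search(shards, offset, size):
    """Return one page of results plus the number of hits shipped by shards."""
    needed = offset + size
    per_shard = [shard_top(hits, needed) for hits in shards]
    transferred = sum(len(part) for part in per_shard)
    merged = heapq.nlargest(needed, (hit for part in per_shard for hit in part))
    return merged[offset:offset + size], transferred


shards = [
    [(0.9, "a"), (0.5, "b"), (0.1, "c")],
    [(0.8, "d"), (0.7, "e"), (0.2, "f")],
]
first_page, sent_first = paged_search(shards, 0, 2)
deep_page, sent_deep = paged_search(shards, 4, 2)
print(first_page, sent_first)
print(deep_page, sent_deep)
```

The page size is the same in both calls, but the deeper page forces more hits over the network, and that amount grows with the offset on every shard.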

The more problematic nature is how lucene uses things like FieldCache and
caches based on readers, which is problematic when using the reader/searcher
based on Lucandra (sorting and facets for example). This can cause very
severe performance/memory problems.

Solandra breaks the index into manageable shards so this is less of a
problem than it was in Lucandra.

Right, but how big can a single shard be? The more you have, the more
problems you run into with how Solr executes distributed search (blocking
IO, http, and all the other bits). And with big shards, having to reload all
the caches once you want to see changes is very costly.

The solution for this long term
comes when Cassandra trigger support comes in, then it will be able to
update the FieldCache's without invalidating them (since Solandra
allocates a fixed space of documents that can be filled in over
time).

FieldCache is one (very good) example. Other includes filter caching, or
any reader cache base constructs. Those are used heavily in advance search
systems and become harder to solve with trigger based approach.

Right, but this is the plan of attack to handle not needing to invalidate
the caches as often.

I haven't taken a deep look at how ES handles this problem but
I'd love a breif description if you have a sec Shay.

Sure, no secret sauce here, it basically uses the same logic as Lucene
FIeldCache, using the segment cache key to cache the data (like field level
cache). Those are immutable (not evicted because of deletes, mind you).

Again, this is based on my overview of the code. If I am missing something,
I would love to be corrected. Not here to spread FUD or anything.

I don't see any FUD here. Simply that Solandra is taking a more
fundamental approach to handling distributed search by dealing with it
at the file format level... Elasticsearch has build distributed search
ontop of Lucene and added a number of great features and service
layer.

Its certainly a cool way to try and solve it. I have been there, trying to
do the same starting from custom Directory implementations to custom readers
mainly optimized to use Data Grids (with collocation), elasticsearch, at
least for me, is the next step that I took from there, mainly letting Lucene
do what it does best, and try and wrap it in the best way possible :).

Term based Vs. Document based partitioning has been a great question to ask
when trying to build distributed search engines. I think that is has
basically been proven, at least from the practical sense, that document
based partitioning is the way to go (watch this on google doing it:
Challenges in Building Large-Scale Information Retrieval Systems - VideoLectures.NET, at 9:44 mark).

This is, by the way, why riak search, though another really cool
technology, is very very problematic. They do term based partitioning, and
its a slippery road from there (think of implementing something like facets
in a collocated manner with term based partitioning, something called by doc
data).

Though Solandra does not strictly use term based partitioning, it still
replicate based on terms (I think?).

Lucandra partitioned on terms, and this was flawed. Solandra uses document
partitions, so a chunk of documents is available locally to search; this
minimizes cross-node IO. A later feature will be partitioned indexes
based on a field, like time windows...

Solandras strongest use case is for those who need potentially
millions of smaller indexes since it's not creating a directory per
index under the hood but rather using composite keys in Cassandra.

Agreed, thats a boon. In elasticsearch there is no reason why you could not
create million of indices, but the overhead of each shard Lucene wise is
substantial (starting with file handles...).

At least for those cases, elasticsearch wise, the idea that you can control
routing when indexing and searching allows you to create index with N number
of shards, and index all those docs with the a qualifier (that also controls
routing) and be able to filter only the ones you want out.


--
http://twitter.com/tjake

Drew, I'm also looking for something similar.

Using cassandra as the DB and Elasticsearch as the index to search
through this DB.

I'm planning to explore the use of an ORM-like layer for Cassandra coupled
with Elasticsearch.
Because I'm programming in Ruby, I'll probably use Escargot for ES
(https://github.com/angelf/escargot) and extend/wrap fauna
(https://github.com/fauna/cassandra) for Cassandra.
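
The shape of such a layer can be sketched roughly as below. The plan above is for Ruby with Escargot and fauna; this sketch is in Python and uses in-memory stand-ins, so `InMemoryStore`, `InMemoryIndex`, and `IndexedRepository` are hypothetical names and none of the calls belong to Cassandra, Elasticsearch, Escargot, or fauna. It only shows the dual-write idea: every save goes to the database and is then mirrored into the search index, and searches return keys that are resolved against the database.

```python
class InMemoryStore:
    """Stand-in for the database (Cassandra in the plan above)."""

    def __init__(self):
        self.rows = {}

    def put(self, key, row):
        self.rows[key] = dict(row)

    def get(self, key):
        return self.rows.get(key)

    def delete(self, key):
        self.rows.pop(key, None)


class InMemoryIndex:
    """Stand-in for the search index (Elasticsearch in the plan above)."""

    def __init__(self):
        self.postings = {}  # term -> set of row keys

    def index(self, key, text):
        self.remove(key)  # drop stale terms before re-indexing
        for term in set(text.lower().split()):
            self.postings.setdefault(term, set()).add(key)

    def remove(self, key):
        for keys in self.postings.values():
            keys.discard(key)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), ()))


class IndexedRepository:
    """ORM-like wrapper: writes hit the store first, then the index."""

    def __init__(self, store, search_index, text_field):
        self.store = store
        self.search_index = search_index
        self.text_field = text_field

    def save(self, key, row):
        self.store.put(key, row)
        self.search_index.index(key, row[self.text_field])

    def delete(self, key):
        self.store.delete(key)
        self.search_index.remove(key)

    def find(self, term):
        return [self.store.get(key) for key in self.search_index.search(term)]


repo = IndexedRepository(InMemoryStore(), InMemoryIndex(), "body")
repo.save("p1", {"body": "Cassandra and search"})
print(repo.find("cassandra"))
```

A real version would also have to decide what happens when the second write fails, since the two systems are not updated atomically; the sketch ignores that.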

M.

On Apr 4, 1:12 pm, Drew d...@venarc.com wrote:

Hi Everyone,

Has anyone done/attempted to integrate Elasticsearch with Cassandra
like was described at http://www.elasticsearch.org/blog/2010/02/25/nosql_yessearch.html
?

I don't mean using Cassandra as a Gateway for ES, but using ES to
query the data in Cassandra automatically.

Thanks,

Drew