Using Cassandra in addition to Elasticsearch

Hi there,
as far as I understand, Elasticsearch has matured enough by now to be used as a primary data store. Data loss is becoming less and less likely, and the ability to freeze indices, introduced in version 6.6, allows for a higher index-to-heap ratio, which translates to more storage capacity for the same price.
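
For reference, freezing is a single REST call per index. Below is a minimal sketch in Python that only builds the requests and does not send them. It assumes a cluster at `http://localhost:9200` and an index named `logs-2019.01` (both placeholders), and the helper names are made up for illustration. The `_freeze` endpoint and the `ignore_throttled` search parameter are the ones documented for 6.6.

```python
from urllib.request import Request

# Assumption: a local cluster on the default port. Adjust for your setup.
ES_URL = "http://localhost:9200"


def freeze_request(index: str, base_url: str = ES_URL) -> Request:
    """Build (but do not send) the request that freezes an index."""
    return Request(f"{base_url}/{index}/_freeze", method="POST")


def frozen_search_request(index: str, base_url: str = ES_URL) -> Request:
    """Build a search request that also covers frozen indices.

    Frozen indices are skipped by default searches, so the
    ignore_throttled=false parameter is needed to include them.
    """
    return Request(f"{base_url}/{index}/_search?ignore_throttled=false", method="GET")


# Hypothetical index name, used only for illustration.
req = freeze_request("logs-2019.01")
print(req.get_method(), req.full_url)
```

To actually run it against a cluster, the request would be passed to `urllib.request.urlopen(req)`.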

Still, I keep reading about things like Elassandra (which uses Apache Cassandra as the datastore and Elasticsearch as a secondary index). What are the use cases for projects like this given the above (assuming I'm not interested in using Cassandra's query capabilities)? Specifically, I read about three advantages of Elassandra:

  • Apache Cassandra's masterless architecture is more stable than Elasticsearch's master-slave architecture.
  • With Elassandra, the same document can be indexed multiple times without using additional disk space.
  • Elassandra's automatic resharding capabilities (inherited from Apache Cassandra?) allow for easy scaling.

How relevant are these points nowadays, and what other (dis)advantages are there to using Apache Cassandra (or something else) as the primary datastore, with Elasticsearch indexing the stored data, compared to using Elasticsearch for both storage and search? For example, how do these setups differ in performance and required disk space?

Thanks for your answers and any sources you can point me to!

Claims like "more stable" and "easy scaling" are hard to pin down TBH, but stock Elasticsearch is pretty stable and pretty easy to scale already. I don't understand the point about indexing the same document multiple times, can you share a reference?

On the other hand, Elassandra is a fork of Elasticsearch, and as of now the latest Elassandra is based on Elasticsearch 6.2.3, so it is missing nearly a year's worth of features and enhancements on the Elasticsearch side.

Hi,
thanks very much for your reply. The article I was referring to (sorry, should have put that in the original question) can be found at DZone.

I'm very new to Elasticsearch and know next to nothing about Apache Cassandra, but I am trying to figure out which software should be used for what.

Having said that, I also don't know the best practices for ingest and indexing, so I can't give you a good example of a use case where one would want to index the same document multiple times.

So would you say there's any need/use for Cassandra in combination with Elasticsearch compared to Elasticsearch alone?

Ah right, thanks. I don't see anything there that isn't basically true of stock Elasticsearch too. I can't really say that one or the other is definitely best for your use case. You'll really have to compare the features they offer against your needs.
