Cassandra + Elastic Search


(Ben McCann) #1

Hi,

I currently have numerous documents stored in Cassandra, but they're
difficult to index. Cassandra has no native support for indexing documents
(especially nested documents), so at this point my datastore just stores
each doc as a blob associated with a key. I'm thinking about adding
Elasticsearch (ES) to my setup to index these docs. Would it make more
sense to have ES index my Cassandra datastore, or should I ditch Cassandra
entirely and have ES store my docs? Getting rid of Cassandra would mean one
less service to run in my cluster, but I'm wondering whether I'd be giving
something up by doing so, such as durability.

Thanks,
Ben


(ppearcy) #2

I recommend keeping Cassandra as your primary data store for now. ES
has improved a lot in terms of durability, but I believe it is still
not recommended as the primary data store.


(Radu Gheorghe) #3

I think it depends on how "critical" your data is. But since you would
use ES for searching whatever the setup, the "durability" part
probably boils down to how long it takes and how complicated it is to
reindex your data in case something goes wrong. Because if ES fails,
you can't search properly even if you have Cassandra working well
underneath.

There are some threads here about how you can back up your ES indexes.
And using ES alone comes with quite a lot of gains in terms of
simplicity and performance.

I would keep Cassandra if it were critical for me to keep the
unindexed data in a durable database. Otherwise, I would ditch it
and find a way to recover in case something goes wrong, which
can happen with Cassandra as well. We just assume ES is more prone to
that because it's a younger product.
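
To make the "keep Cassandra, rebuild the index if needed" option concrete, here is a minimal Python sketch. It does not use the real Cassandra or ES clients: BlobStore and SearchIndex are hypothetical in-memory stand-ins, and flatten, save and rebuild_index are names made up for illustration. It only shows the shape of the setup, where blobs live in the primary store and the search index can be thrown away and rebuilt from them.

```python
import json


class BlobStore:
    """Hypothetical stand-in for the primary store: JSON blobs by key."""

    def __init__(self):
        self._rows = {}

    def put(self, key, doc):
        self._rows[key] = json.dumps(doc)

    def scan(self):
        # Yield every (key, document) pair held by the store.
        for key, blob in self._rows.items():
            yield key, json.loads(blob)


def flatten(doc, prefix=""):
    """Yield (dotted.path, value) pairs for a nested dict.

    Assumes leaf values are scalars (strings, numbers, booleans).
    """
    for name, value in doc.items():
        path = prefix + name
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


class SearchIndex:
    """Hypothetical stand-in for the search side: a toy inverted index.

    Re-indexing a changed document would also need to drop its stale
    postings; that is omitted here.
    """

    def __init__(self):
        self._postings = {}

    def index(self, key, doc):
        for path, value in flatten(doc):
            self._postings.setdefault((path, value), set()).add(key)

    def search(self, path, value):
        return sorted(self._postings.get((path, value), set()))


def save(store, index, key, doc):
    store.put(key, doc)    # write to the primary store first
    index.index(key, doc)  # then make the document searchable


def rebuild_index(store):
    """Recreate the index from scratch by scanning the primary store."""
    fresh = SearchIndex()
    for key, doc in store.scan():
        fresh.index(key, doc)
    return fresh


store, index = BlobStore(), SearchIndex()
save(store, index, "doc1", {"author": {"name": "ben"}, "lang": "en"})
save(store, index, "doc2", {"author": {"name": "radu"}, "lang": "en"})

index = rebuild_index(store)  # simulate losing the index and recovering
print(index.search("author.name", "ben"))  # ['doc1']
```

With this layout, the cost of an index failure is roughly the time a full scan and reindex takes. Without the separate store, recovery would depend on backups of the index itself.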

