I have read some threads in which people mention that ES is their only
datastore. From what I have read, it can be an option.
However, can you point out some drawbacks of doing so? My main concern
is transaction support. I have another project on Google App Engine,
an app that does not use transactions. It can be done, but it
requires more thought.
I am currently syncing ES with MySQL, including transaction support.
However, I would love to let go of MySQL and not have to worry about
scaling it.
Well, we just hook into the lifecycle events from our service layer
and replicate the CRUD operations to ES. It's a lot like Hibernate
Search, for those who know it.
At first, I wanted to avoid modifying the service layer, so I tried to use Hibernate listeners. That didn't work well: when you update only a child entity, the listener is called only for the child entity, even if you ask Hibernate to merge from the parent entity. I didn't find an easy way around that.
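To illustrate the service-layer approach, here is a minimal sketch, not anyone's actual implementation. Every name in it (`Parent`, `Child`, `ParentService`, `InMemorySearchIndex`) is hypothetical, and plain dictionaries stand in for both the primary datastore and ES. The point is only that the service knows which parent owns a child, so it can re-index the whole parent document after a child-only update.

```python
from dataclasses import dataclass, field


@dataclass
class Child:
    name: str


@dataclass
class Parent:
    parent_id: str
    children: list = field(default_factory=list)


class InMemorySearchIndex:
    """Hypothetical stand-in for the ES index: one document per parent."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, doc):
        self.docs[doc_id] = doc


class ParentService:
    """Service layer that replicates every write to the search index."""

    def __init__(self, search_index):
        self._parents = {}  # stand-in for the primary datastore
        self._index = search_index

    def save(self, parent):
        self._parents[parent.parent_id] = parent
        self._reindex(parent)

    def rename_child(self, parent_id, position, new_name):
        parent = self._parents[parent_id]
        parent.children[position].name = new_name
        # Only a child changed, but the service knows its owner,
        # so the whole parent document is re-indexed.
        self._reindex(parent)

    def _reindex(self, parent):
        document = {
            "id": parent.parent_id,
            "children": [child.name for child in parent.children],
        }
        self._index.put(parent.parent_id, document)


index = InMemorySearchIndex()
service = ParentService(index)
service.save(Parent("p1", [Child("a")]))
service.rename_child("p1", 0, "b")
```

After the child-only update, the indexed document for `p1` carries the new child name, which is the case the listener approach missed.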
Thanks
David
On 24 June 2011 at 05:54, Remy Gendron remy@arrova.ca wrote:
Well, just hooking into the lifecycle events from our service layer
and replicating the cruds to ES. It's a lot like Hibernate Search for
those who know.
Some of the drawbacks of using ES as a datastore are:
1. Lack of transactional support (as you mention). We use Hazelcast as a
   memcached layer between our application and ES. Hazelcast supports
   distributed transactions. While not supporting commit and rollback upon
   a write failure, it does give us the ability to commit multiple writes
   as a single unit of work.
2. No snapshotting of data. I miss this greatly for peace of mind. You
   have to code your own solutions to be able to recover from corruption
   of the Lucene indexes, as well as when ES introduces a change that
   requires a reindexing of all the data. A shared gateway can help, but
   you will have to pause indexing/flushing while making a backup of the
   repository.
3. Near-real-time behavior is hard for most people who come from a DB
   background. You can't insert a record and then query for that record
   without issuing a refresh call or waiting long enough to ensure the
   record has been indexed.
4. No SQL support. It goes without saying, but this has an impact in that
   there are no tools which allow you to manipulate data once it is in
   the repository. Perhaps these will come in time.
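The near-real-time point is easier to see with a toy model. The sketch below is not the ES API: `ToyNearRealTimeIndex` and its methods are made-up names, and it only assumes what is described above, namely that a freshly indexed record becomes searchable after a refresh.

```python
class ToyNearRealTimeIndex:
    """Toy model: writes are buffered and become searchable only on refresh()."""

    def __init__(self):
        self._buffer = {}      # indexed, but not yet visible to search()
        self._searchable = {}  # visible to search()

    def index(self, doc_id, doc):
        # The write is accepted, but search() cannot see it yet.
        self._buffer[doc_id] = doc

    def refresh(self):
        # Make every buffered write visible to search().
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search(self, field_name, value):
        # Return the ids of searchable documents whose field equals value.
        return sorted(
            doc_id
            for doc_id, doc in self._searchable.items()
            if doc.get(field_name) == value
        )


index = ToyNearRealTimeIndex()
index.index("1", {"user": "remy"})
before_refresh = index.search("user", "remy")  # record not visible yet
index.refresh()
after_refresh = index.search("user", "remy")   # record now visible
```

Code written for a database tends to assume the first search already returns the record, which is why this behavior surprises people.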
Shay, are things such as index migration when a new release comes out,
transactions, full restore, etc. being considered for the long-term
roadmap? I guess that if Lucene isn't meant to be a datastore, it would
be hard for ES to provide this...