Hey,
Excellent discussion. Let me add my view on some of the
things raised here:
There are several reasons you might want to add a nosql store, or even a database
(shudder), alongside elasticsearch, but the main one is that they all
have really cool features that elasticsearch does not provide (some of which it
will never provide, due to its "search" nature). Some examples include
the relational model (you hear that, I called this cool ;) ), cross-operation
transactions, or nosql-specific features such as couchdb's unique way of
handling changes.
Other features might eventually end up implemented in elasticsearch, such as
full real time, or "parent child" relationships.
Regarding storing just what you want to search on and display, I
agree as well. Note that no matter how fast your nosql of choice is, it's
always faster to return the data as part of the search request than to
return a list of ids and look them up in a nosql store. By the way, if
someone is up for an interesting test, then test elasticsearch key based
lookup performance against other nosqls; you will be surprised at the
results (and elasticsearch does no caching on key based lookups, except
for the os file system cache of course).
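To illustrate the round-trip argument above, here is a rough Python sketch. It is a toy model only: `ToySearchIndex` and `ToyKeyValueStore` are hypothetical in-memory stand-ins, not the elasticsearch or any nosql client API, and it counts requests rather than measuring real latency.

```python
# Toy model (not a real client API): count round trips for two retrieval strategies.

class ToySearchIndex:
    """Hypothetical stand-in for a search engine that keeps stored fields."""

    def __init__(self, docs):
        self.docs = dict(docs)   # id -> stored fields
        self.round_trips = 0

    def _matching_ids(self, term):
        return [doc_id for doc_id, doc in sorted(self.docs.items())
                if term in doc["title"]]

    def search_ids(self, term):
        # One request that returns only the ids of the hits.
        self.round_trips += 1
        return self._matching_ids(term)

    def search_with_fields(self, term):
        # One request that returns the stored fields along with each hit.
        self.round_trips += 1
        return [self.docs[doc_id] for doc_id in self._matching_ids(term)]


class ToyKeyValueStore:
    """Hypothetical stand-in for a nosql store queried by key."""

    def __init__(self, docs):
        self.docs = dict(docs)
        self.round_trips = 0

    def get(self, doc_id):
        self.round_trips += 1
        return self.docs[doc_id]


docs = {
    1: {"title": "elasticsearch intro"},
    2: {"title": "mongodb intro"},
    3: {"title": "elasticsearch and couchdb"},
}

# Strategy A: search for ids, then look each one up in the nosql store.
index_a, store = ToySearchIndex(docs), ToyKeyValueStore(docs)
hits_a = [store.get(doc_id) for doc_id in index_a.search_ids("elasticsearch")]
trips_a = index_a.round_trips + store.round_trips

# Strategy B: let the search request return the stored fields directly.
index_b = ToySearchIndex(docs)
hits_b = index_b.search_with_fields("elasticsearch")
trips_b = index_b.round_trips
```

Both strategies return the same documents, but strategy A needs one extra request per hit in this model. A batched lookup would reduce that to one extra request, which is still more than none.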
Regarding mongodb, then yes, sadly, the only option I see for integrating it
with elasticsearch is doing it at the "application" layer, by applying
the same operations to both mongodb and elasticsearch. This can be
abstracted away within the mongodb client driver of choice, which would make
life much simpler.
If mongodb had post-commit hooks, or a way to get the stream of changes
happening on it, then other integration points would be possible.
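The application-layer approach described above could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `DualWriter` is a hypothetical wrapper, and plain dicts stand in for the mongodb and elasticsearch clients so the example runs on its own.

```python
class DualWriter:
    """Hypothetical wrapper that applies each write to both stores.

    `primary` and `search_index` are plain dicts standing in for mongodb
    and elasticsearch; a real wrapper would call the respective clients.
    """

    def __init__(self, primary, search_index, search_fields):
        self.primary = primary
        self.search_index = search_index
        self.search_fields = search_fields

    def save(self, doc_id, doc):
        # Write the full document to the primary store first...
        self.primary[doc_id] = dict(doc)
        # ...then index only the fields meant for search and display.
        self.search_index[doc_id] = {
            field: doc[field] for field in self.search_fields if field in doc
        }

    def delete(self, doc_id):
        # Apply the same delete to both stores.
        self.primary.pop(doc_id, None)
        self.search_index.pop(doc_id, None)


mongo_stub, es_stub = {}, {}
writer = DualWriter(mongo_stub, es_stub, search_fields=["title", "tags"])
writer.save("42", {"title": "hello", "tags": ["a"], "body": "long text"})
writer.save("43", {"title": "bye", "body": "more text"})
writer.delete("43")
```

Note that nothing here is transactional: if the second write fails after the first succeeds, the two stores diverge, so a real wrapper would need some retry or repair strategy.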
Regarding the "update" option: an update is a delete followed by reindexing
that document. This is called "index" in elasticsearch and is actually the
default mode when indexing data. There is also an option to "create" a document,
which does no deletion in advance and therefore performs better,
but if you issue two "creates" with the same id, then two
documents will exist for it.
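To make the difference between "index" and "create" concrete, here is a small Python sketch. It models only the behavior described above, not elasticsearch internals; `ToyIndex` and its methods are hypothetical names.

```python
class ToyIndex:
    """Hypothetical model of the "index" and "create" modes described above."""

    def __init__(self):
        self.entries = []  # list of (doc_id, doc) pairs

    def index(self, doc_id, doc):
        # "index": delete any existing copy with this id, then add the new one.
        self.entries = [(i, d) for i, d in self.entries if i != doc_id]
        self.entries.append((doc_id, doc))

    def create(self, doc_id, doc):
        # "create": skip the delete step, which is cheaper but allows duplicates.
        self.entries.append((doc_id, doc))

    def copies(self, doc_id):
        return sum(1 for i, _ in self.entries if i == doc_id)


idx = ToyIndex()

# Indexing the same id twice leaves a single, updated document.
idx.index("1", {"v": 1})
idx.index("1", {"v": 2})
indexed_copies = idx.copies("1")

# Creating the same id twice leaves two documents for that id.
idx.create("2", {"v": 1})
idx.create("2", {"v": 2})
created_copies = idx.copies("2")
```

So "create" is only safe when the application can guarantee that each id is written once.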
-shay.banon
On Sun, Oct 3, 2010 at 3:26 AM, Martijn Laarman m.laarman@datheon.com wrote:
To all wondering why you might still need a nosql solution next to
Elasticsearch, here's my two cents.
Elasticsearch is still near real time (NRT), whereas with (most) nosql document
stores adding a document makes it available immediately afterwards (real time).
Another reason is that indexing only those fields you want to search and storing
only those fields that need to be shown (on overviews, lists, dropdowns,
etcetera) can be a huge query performance booster. A nosql solution is more
geared towards looking up objects (graphs) by ID as fast as possible,
and querying for ranges of objects is limited or crude.
In web applications, using ES for overviews, lists, and dropdowns and nosql for
item pages (view, edit objects) really gives you the best of both worlds.
Side note: jlist9, updating a SOLR document should not require a prior
delete either.
On Sat, Oct 2, 2010 at 10:36 PM, James Cook jcook@tracermedia.com wrote:
We use Hazelcast in front of Elastic Search. All data is put into
Hazelcast first which then has a bridge that writes it to ES. We also write
to MySQL at the same time, until ES rises from beta.
Hazelcast gives us a distributed cache for identity retrieval, while we go
directly to ES for queries. Hazelcast also gives us the transactional
controls that MongoDB and ES lack.
-- jim
On Sat, Oct 2, 2010 at 4:03 PM, jlist9 jlist9@gmail.com wrote:
Hi Clint,
Thanks for the reply.
You don't have to delete it, you just "index" the doc again, and it will
take care of removing the old copy.
Good to know this.
kimchy, the developer, recommends against this at the moment, at least
until ES reaches 1.0 - Things are still changing, and recent releases
have changed the long term storage and temporary work storage structure
and required re-indexing, so you still need your data available
elsewhere.
I see. Thanks for explaining. I'll keep a separate datastore for now.
But I guess the moral of the story is that everyone is hoping for an
integrated datastore and search engine. Going back to my original
email, maybe ES will take off if some language driver level integrations
are implemented for the popular nosql stores such as mongodb. All
the user will need to do is update the mongodb docs and the
ES index is updated automagically!
Yes - it exposes all (most?) of Lucene's API via its query DSL, and adds
a few really nice features.
See:
http://www.elasticsearch.com/docs/elasticsearch/rest_api/query_dsl/
That's great! Although I'm currently using solr, I really like the
distributed nature (I don't need it now, but it's there if you need it)
and the simple REST interface of ES. I'll definitely keep it in mind when I
start my next project.
Jack