What can MongoDB do that ES cannot?

Hello,
I had a conversation with a friend of mine about an architectural choice for
his application.
A large amount of data needs to be queried in the usual way (many reads, some
writes). It needs to be distributed, etc.
His idea was to run MongoDB alongside ES, with ES indexing a subset of
the data from MongoDB.
I proposed getting rid of MongoDB completely, since the subset is relatively
big, storing all the data in ES alone, and using it as the DB as well as the
text search engine.
So ES would be a NoSQL DB with text search features, which is actually a
superset of MongoDB (feature-wise).
This way the application would be simpler, have fewer points of failure, etc.
Neither of us is an expert in MongoDB, nor could we call ourselves experts in
ES; we are more like experienced users. So, to dispel all doubts, could someone
who has experience with both propose some scenarios where using
MongoDB in conjunction with ES is justified?
For now, I'd say that if there is ES in the house, there is no need to bring
MongoDB in.

Thanks in advance,
Eugene

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hiya

I don't have more than a superficial knowledge of Mongo, so I can't really
answer your question, but:

  • there are a number of people using ES as their single data store

  • Shay does not yet recommend doing so, as ES currently does not
    have an easy backup mechanism. It is in development and should
    be available in version 1.

  • ES is developing quickly, and pushing the limits of what has been
    done before. At times, we have uncovered bugs which can put
    your data at risk. For instance, v0.20.6 fixes a bug in Lucene
    which can cause data loss:
    "TooManyOpenFiles might cause data-loss in ElasticSearch (Lucene)",
    issue #2812 in elastic/elasticsearch on GitHub

  • I don't know the history of Mongo, but I'm sure they've had
    similar issues over time.

So my recommendations would be:

  • make sure you have a backup strategy in place, either by backing
    up the data/ dir on all nodes, or by having all of your data
    somewhere else, even if it is just JSON in text files

  • go through your requirements, and see if Elasticsearch supports
    them.
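
As a rough illustration of the "JSON in text files" fallback mentioned above, the following Python sketch writes documents to a newline-delimited JSON file and reads them back. The names `dump_documents` and `load_documents` and the sample documents are hypothetical; pulling the documents out of the cluster is not shown and is assumed to happen elsewhere.

```python
import json
import os
import tempfile


def dump_documents(docs, path):
    """Write each document as one JSON object per line; return the count."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for doc in docs:
            handle.write(json.dumps(doc, sort_keys=True) + "\n")
            count += 1
    return count


def load_documents(path):
    """Read the documents back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical sample data standing in for documents fetched from the cluster.
sample_docs = [
    {"_id": "1", "title": "first post", "tags": ["es", "mongo"]},
    {"_id": "2", "title": "second post", "tags": []},
]

backup_path = os.path.join(tempfile.mkdtemp(), "backup.jsonl")
written = dump_documents(sample_docs, backup_path)
restored = load_documents(backup_path)
```

One document per line keeps the file easy to append to and to re-read in a streaming fashion, which matters once the data no longer fits in memory.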

clint


Hi Eugene,

Two facts might be problematic for you when using Elasticsearch as a
primary datastore. It does not support transactions, so it is only atomic for
a single operation. It also limits how soon data is available for search:
it supports "near-realtime" search, so data will not be immediately
available for querying.
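
The near-realtime behaviour described above can be pictured with a toy model. This is only a sketch under stated assumptions, not how Elasticsearch is implemented: the class `NearRealTimeIndex` and its methods are hypothetical, writes sit in a buffer until an explicit `refresh()`, and a fetch by id is assumed to see unrefreshed writes.

```python
class NearRealTimeIndex:
    """Toy model: writes land in a buffer and become searchable on refresh."""

    def __init__(self):
        self._searchable = {}   # documents visible to search
        self._buffer = {}       # documents written but not yet refreshed

    def index(self, doc_id, doc):
        """Accept a write; it is not visible to search until refresh()."""
        self._buffer[doc_id] = doc

    def get(self, doc_id):
        """Fetch by id, checking the unrefreshed buffer first."""
        if doc_id in self._buffer:
            return self._buffer[doc_id]
        return self._searchable.get(doc_id)

    def refresh(self):
        """Make all buffered writes visible to search."""
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search(self, word):
        """Return ids of searchable documents whose text contains the word."""
        return sorted(
            doc_id
            for doc_id, doc in self._searchable.items()
            if word in doc["text"].split()
        )


idx = NearRealTimeIndex()
idx.index("1", {"text": "hello elasticsearch"})
before_refresh = idx.search("hello")   # write not yet visible to search
fetched = idx.get("1")                 # but retrievable by id
idx.refresh()
after_refresh = idx.search("hello")    # visible once refreshed
```

In the real system the refresh happens periodically rather than by hand, so the gap between a write and its appearance in search results is short but not zero.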

Michael

In reply to Eugene Strokin's message of Friday, March 29, 2013, 4:48:59 PM UTC+1.

Hi Michael,

But beware, MongoDB isn't transactional either! The new 2.4 release of Mongo
has integrated text search, like ES.
In Mongo you have many more features for structuring the data, such as embedded
collections, soft links, etc.

Regards
Alex

In reply to mkleen's message of Sunday, March 31, 2013, 21:38:13 UTC+2.

We switched our "big data" application from Mongo to Elasticsearch with
great success; in fact, most of what it does wasn't even possible with
previous versions of Mongo. Mongo 2.4 is looking interesting, but I still
don't think it can do what we needed. The biggest thing to remember is
that currently Mongo is at heart a NoSQL database with text search "added",
and Elasticsearch is search with a NoSQL database "added".

  • The biggest thing I miss from Mongo is actually terminology related.
    Mongo uses "DB terminology" and ES uses "search terminology" for APIs and
    documentation. I've had many "who's on first" moments: you understand the
    words but the meaning is different, "index" being probably the biggest
    offender. I guess it never hurts to learn something new, although at times
    I wish there was a "normal" DB-looking interface. :)

  • Mongo has much better data durability, load balancing and HA from a
    database point of view. 0.90 is fixing some of the issues. If you can't
    afford to lose data, then don't use ES as the single source. You will most
    likely lose part of a database at least once.

  • Mongo allows you to natively re-index (re-analyze) your data. ES
    currently requires aliases and that you re-insert (re-index) the data using
    an external program or script. This can be difficult to get right without
    losing data if you have constant writes.

  • Mongo has better software version upgrades. Most of the time you need to
    restart the entire ES cluster with the new version, which makes it harder
    to test out new software. I'm hoping this will change as ES matures.

  • Mongo has typical DB security if desired; ES does not.
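
The alias-based re-index mentioned in the list above can be sketched with an in-memory stand-in. Everything here is hypothetical (`ToyCluster`, `reindex`, the index names `posts_v1` and `posts_v2`, and the two analyzers); it is not a real client API, only an illustration of copying into a new index and then repointing the alias.

```python
class ToyCluster:
    """In-memory stand-in for indices plus aliases (not a real client)."""

    def __init__(self):
        self.indices = {}   # index name -> {doc_id: analyzed doc}
        self.aliases = {}   # alias name -> index name

    def create_index(self, name):
        self.indices[name] = {}

    def point_alias(self, alias, index_name):
        """Repoint the alias in one step so readers never see a gap."""
        self.aliases[alias] = index_name

    def search(self, alias, token):
        docs = self.indices[self.aliases[alias]]
        return sorted(i for i, d in docs.items() if token in d["tokens"])


def reindex(cluster, source, target, analyze):
    """Copy every document from source to target, re-analyzing its text."""
    for doc_id, doc in cluster.indices[source].items():
        cluster.indices[target][doc_id] = {
            "text": doc["text"],
            "tokens": analyze(doc["text"]),
        }


def whitespace(text):
    return text.split()


def lowercase(text):
    return text.lower().split()


cluster = ToyCluster()
cluster.create_index("posts_v1")
cluster.indices["posts_v1"]["1"] = {
    "text": "Mongo And ES",
    "tokens": whitespace("Mongo And ES"),
}
cluster.point_alias("posts", "posts_v1")
hits_old = cluster.search("posts", "mongo")   # case-sensitive tokens: no hit

cluster.create_index("posts_v2")
reindex(cluster, "posts_v1", "posts_v2", lowercase)
cluster.point_alias("posts", "posts_v2")
hits_new = cluster.search("posts", "mongo")   # lowercased tokens: hit
```

The sketch also shows where the difficulty with constant writes comes from: anything written to `posts_v1` after the copy loop has passed it, but before the alias is repointed, never reaches `posts_v2` unless it is replayed separately.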

I don't want to seem overly negative here. ES has tons of cool features, but
in the NoSQL DB world it is still a toddler. If you have time to grow with
it, then go for it. The data durability issue is really the biggest thing
you need to live with; the rest you can work around.

Thanks,
Andy


Hi Andy,

Thanks for your interesting post. The comparison part is especially useful for
me, because we plan to build a cloud-based platform with Mongo and Postgres
as the DB backend and a search service built on ES.

Why did you switch your app to ES? I thought ES holds only index- and
search-relevant things such as ids, keywords and small metadata, as a
subset of the original dataset, and returns the ids used to fetch the whole
dataset from the DB.
I'm new to the whole world of combining ES with NoSQL DBs like MongoDB. Is
there a good best practice for combining the two parts? For example,
what data should be "stored" in ES and what in Mongo?
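
The split described in this question (search holds a small subset and returns ids, the database holds the full records) might look roughly like the sketch below. The dictionaries and the functions `build_search_index` and `search_then_fetch` are hypothetical stand-ins for the search service and the primary database, not any real client API.

```python
from collections import defaultdict

# Hypothetical primary store: full records keyed by id (stands in for the DB).
primary_store = {
    "a1": {"title": "Scaling MongoDB", "body": "long article text ...", "author": "kim"},
    "a2": {"title": "Search with ES", "body": "another long text ...", "author": "lee"},
}


def build_search_index(store, fields):
    """Index only the chosen fields: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, doc in store.items():
        for field in fields:
            for token in doc[field].lower().split():
                index[token].add(doc_id)
    return index


def search_then_fetch(index, store, token):
    """Ask the search side for ids, then load the full records from the store."""
    ids = sorted(index.get(token.lower(), set()))
    return [store[doc_id] for doc_id in ids]


title_index = build_search_index(primary_store, fields=["title"])
results = search_then_fetch(title_index, primary_store, "search")
```

The trade-off is an extra round trip per query in exchange for a smaller search index and a single authoritative copy of each record.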

Best regards
Alex

In reply to Andy Wick's message of April 1, 2013, 07:52.

Guys, thank you for your input.
I agree about the terminology and the "near real time" behaviour, but those are
more like minor inconveniences to me.
In my experience, I have worked around those inconveniences the same way we
work around inconveniences with SQL DBs.
I guess NRT is what allows ES to run crazy fast, and in Ajax-based web
applications I was always able to give users the impression that they were
getting real-time data. Only NRT delete was kind of tricky, but still not a
problem.
From the responses, I see that data loss is the only major issue. I haven't had
that problem. I have had to restore from daily backups, but purely because of
human error, not a failure of the system.
I have had nodes get screwed up a few times, but a plain ES restart resolved that.
I remember some conversations about MySQL a long time ago claiming that it, too,
could lose data at some point. Usually those conversations were started by
Oracle guys, before Oracle bought Sun. But I don't know of any actual case of
production data loss reported by MySQL users or by ES users.
Another good point is that MongoDB can update a single field, while ES can
update only the whole document. In some cases this is critical, but again, in my
cases, if the document structure is right, it is acceptable to reindex the whole
document even if only one field has actually changed.
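
Changing one field by re-indexing the whole document is a read-modify-write cycle. The sketch below is a toy under stated assumptions: `ToyDocStore`, `update_field` and `VersionConflict` are hypothetical names, and the version check stands in for the kind of per-document versioning a store needs so that a concurrent write is detected rather than silently overwritten.

```python
class VersionConflict(Exception):
    """Raised when the stored document changed since it was read."""


class ToyDocStore:
    def __init__(self):
        self._docs = {}   # doc_id -> (version, document)

    def index(self, doc_id, doc, expected_version=None):
        """Replace the whole document; optionally require a known version."""
        current_version = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(doc_id)
        self._docs[doc_id] = (current_version + 1, dict(doc))
        return current_version + 1

    def get(self, doc_id):
        version, doc = self._docs[doc_id]
        return version, dict(doc)


def update_field(store, doc_id, field, value):
    """Change one field by re-indexing the whole merged document."""
    version, doc = store.get(doc_id)
    doc[field] = value
    return store.index(doc_id, doc, expected_version=version)


store = ToyDocStore()
store.index("42", {"title": "draft", "views": 0})
new_version = update_field(store, "42", "views", 1)
```

Each call is atomic only for that one document, which is also why this pattern cannot replace a multi-document transaction.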

So, I'd really like to hear from someone who has had a problem with data loss.
If you have had such an experience, could you share how it happened, what kind
of data it was, how big it was, how much load there was, etc.?
Because if the only cause of data loss was the bug mentioned above, related to
the open files limit, then once it is fixed we have nothing to worry
about. And even with this bug, if the system is configured properly and the
open file limit is set correctly, it is very unlikely to occur.
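
For checking the open file limit mentioned above, a minimal sketch using Python's standard `resource` module (available on Unix-like systems only) could look like this. The helper `limit_is_sufficient` and the threshold of 32000 are illustrative assumptions, not an official recommendation.

```python
import resource

# RLIMIT_NOFILE is the per-process cap on open file descriptors (Unix only).
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_sufficient(soft, required):
    """True if the soft limit is unlimited or at least the required number."""
    return soft == resource.RLIM_INFINITY or soft >= required


# 32000 is an illustrative threshold, not an official recommendation.
enough = limit_is_sufficient(soft_limit, 32000)
```

The check reports the limit of the process that runs it, so it is only meaningful when run as the same user and in the same environment as the search node.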

Thank you,
Eugene


I've had several problems with data loss, and I don't work for Oracle. :)
You can find my issues (and others') on GitHub or by searching this mailing
list. Usually one node will delete one index for whatever reason. Some of
those causes have been fixed, such as in 0.20.6; some I'm not so sure
about. The bigger issue is that sometimes, to recover, you need to delete the
index, whereas I would rather just have a "hole" in the data. Also, sometimes
in this state replication fails to fix it for whatever reason.

Another big ES issue is that it is possible to run queries (especially facets)
that blow memory out of the water; ES then just hard exits and you hope
nothing gets corrupted.

I think you misunderstood the update issue. ES does allow you to update a
single document at a time. I was talking about updating how the field is
mapped. If you want to change the mapping of a field, you need to reload
the entire index. (Again, search the mailing list; lots of folks hit this.)
You are correct that if you know your mapping ahead of time you won't hit
this.


hi Clinton

Considering the enormous amount of value added to ES since the original question was posted, I wonder whether the answer has tilted in favor of Elasticsearch.

Can we safely say that Elasticsearch can be considered as a primary data store?

Hi Samant,

While tremendous progress has been made in this direction (it is a
major focus of the 1.4.0.Beta1 release), this is not something that we would
recommend doing yet. It might be an acceptable solution in some cases, but
should you decide to use Elasticsearch as a primary datastore, make sure to
read our resiliency status[1] page first to learn about recent improvements
and known issues.

[1] Elasticsearch resiliency status page (Elastic)

In reply to samant's message of Thu, Oct 16, 2014, 4:52 PM.

--
Adrien Grand
