Does document database mean denormalize

Jilles_van_Gurp · June 13, 2014, 9:37am

Yes, definitely think in terms of denormalizing. Joins are hard/expensive
in elasticsearch, so you avoid having to join by prejoining the data when
you index it. But you have other options as well; see the Elastic site.

So, say you had a person table and an address table in a database, where you
have a 1:1 relation, that's a no-brainer: shove the address in the person
index along with the rest of the person data.
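
To make the 1:1 case concrete, here is a minimal sketch in plain Python. The
rows, field names, and the to_person_document helper are all made up for
illustration; the point is only that the address ends up inside the person
document instead of being referenced by id.

```python
import json

# Hypothetical rows as they might come out of two relational tables.
person_row = {"id": 1, "name": "Alice", "address_id": 10}
address_row = {"id": 10, "street": "1 Main St", "city": "Berlin"}


def to_person_document(person, address):
    """Embed the 1:1 address inside the person document."""
    # Drop the foreign key; the relation is now expressed by nesting.
    doc = {k: v for k, v in person.items() if k != "address_id"}
    # The address no longer needs its own id once it lives in the person.
    doc["address"] = {k: v for k, v in address.items() if k != "id"}
    return doc


person_doc = to_person_document(person_row, address_row)
print(json.dumps(person_doc, indent=2))
```

The resulting JSON is what you would send to the person index, so a single
query returns the person together with the address.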

If you had another table called company with a 1:n relation to person, it
gets trickier. Now you have options.

Option 1: put the company data in the person index. Sure you are copying
data all over the place but storage is cheap and it is not like you are
going to have a trillion companies or persons. Your main worry is not space
but consistency. What happens if you need to change the company details?
Option 2: put the person objects in an array in the company objects. Fine
as long as you don't need to query for the persons separately.
Option 3: store just the company id in the person index or the person id in
the company index (array). Now you will end up in situations where you may
need to join and you'll have to fire many queries and manipulate search
results to do it, which is slow, tedious to program, and somewhat error
prone. But for simple use cases you might get away with it.
Option 4: use nested documents to put persons in companies. Now you can use
nested queries and aggregations, which give you join-like benefits. Don't
use this for massive amounts of nested documents on a single parent.
Option 5: use parent-child documents to give persons a company parent. More
flexible than nested and gives you some performance benefits since parent
and child reside on the same shard. So, the same as option 3 but faster.
Option 6: compromise: denormalize some but not all of the fields and keep
things in a separate index as well.
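
A rough sketch of how options 1, 3, and 4 differ, in plain Python with
in-memory dicts standing in for indices. All names, fields, and helper
functions are hypothetical. The nested mapping and query are request bodies
written as dicts; they assume the typeless mapping layout of recent
Elasticsearch versions, whereas versions from the time of this thread put
the mapping under a type name.

```python
# In-memory stand-ins for two indices; every name and field here is made up.
companies = {"c1": {"name": "Acme", "city": "Utrecht"}}
people = [
    {"id": "p1", "name": "Alice", "company_id": "c1"},
    {"id": "p2", "name": "Bob", "company_id": "c1"},
]


def denormalize(person, companies):
    """Option 1: copy the company fields into each person document."""
    company_id = person["company_id"]
    doc = {k: v for k, v in person.items() if k != "company_id"}
    doc["company"] = {"id": company_id, **companies[company_id]}
    return doc


def documents_to_rewrite(person_docs, company_id):
    """Consistency cost of option 1: every copy must be reindexed on change."""
    return [d["id"] for d in person_docs if d["company"]["id"] == company_id]


def application_side_join(people, companies):
    """Option 3: store only the id and resolve it with an extra lookup."""
    return [{**p, "company": companies[p["company_id"]]} for p in people]


person_docs = [denormalize(p, companies) for p in people]
stale_ids = documents_to_rewrite(person_docs, "c1")
joined = application_side_join(people, companies)

# Option 4: persons as nested objects inside the company document.
nested_mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "persons": {
                "type": "nested",
                "properties": {
                    "name": {"type": "text"},
                    "role": {"type": "keyword"},
                },
            },
        }
    }
}

# Both conditions must match within the same nested person.
nested_query = {
    "query": {
        "nested": {
            "path": "persons",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"persons.name": "alice"}},
                        {"term": {"persons.role": "engineer"}},
                    ]
                }
            },
        }
    }
}
```

With option 1, changing the company means rewriting every document in
stale_ids. With option 3, the lookup in application_side_join would be a
second round of queries against a real cluster. With option 4, the join
happens inside one company document.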

With n:m style relations it gets a bit harder. You probably don't want to
index the cartesian product, so you'll need to compromise. Any of the
options above could work. It all depends on how many relations you are
really managing.
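
One possible compromise for n:m, sketched in plain Python: keep one document
per person and copy only a small summary of each related company into it,
rather than indexing a full document for every person/company pair. The
data, the build_person_docs helper, and the choice of copied fields are
assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical n:m data: a person can belong to several companies and
# a company can have several persons.
companies = {
    "c1": {"name": "Acme", "city": "Utrecht", "employees": 120},
    "c2": {"name": "Globex", "city": "Berlin", "employees": 40},
}
persons = {"p1": {"name": "Alice"}, "p2": {"name": "Bob"}}
memberships = [("p1", "c1"), ("p1", "c2"), ("p2", "c1")]


def build_person_docs(persons, companies, memberships, copied_fields=("name",)):
    """One document per person with a short summary of each company."""
    by_person = defaultdict(list)
    for person_id, company_id in memberships:
        summary = {"id": company_id}
        # Copy only the fields you actually search or display on.
        summary.update({f: companies[company_id][f] for f in copied_fields})
        by_person[person_id].append(summary)
    return [
        {"id": pid, **fields, "companies": by_person.get(pid, [])}
        for pid, fields in persons.items()
    ]


person_docs = build_person_docs(persons, companies, memberships)
for doc in person_docs:
    print(doc)
```

The full company records would still live in their own index, so a change
to a field that was not copied touches only one document.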

We've actually gotten rid of our database entirely. Once you get used to
it, thinking in terms of documents is much more natural than thinking in
terms of rows, tables, and relations. You have much less of an impedance
mismatch that you need to pretend does not exist with some
object-relational library. It's more like: here's an object, serialize it,
store it, query for it.

Jilles

On Friday, June 13, 2014 9:48:37 AM UTC+2, eune...@gmail.com wrote:

What I am asking is

Do different design decisions apply in elasticsearch compared to relational

Is denormalized better for elasticsearch

