I'm currently working on designing my app's DB.
I have several entities in the system, such as:
*Persons
*Bank Accounts
*Money transactions
Each bank account has one or more owners.
Each money transaction is between 2 bank accounts.
My DB should answer simple questions such as
"All people who are living in Paris"
"All money transactions in the last week"
but also "relational" questions such as
"All bank accounts of persons that are older than 40 years"
"All money transactions in London"
I thought of modeling each entity as an index and adding one more index for relations. That leaves me with 2 options:
1. Use ids -> "source_id", "target_id", "relation properties"
With this option I would first filter on the entity itself (persons older than 40) and then search the relations by the resulting ids. That can mean a lot of data and a lot of time.
2. Denormalize -ALL- the searchable data onto the relation:
"source_person_id", "source_person_name", "source_person_age"...
This is a very expensive option: every update of an entity in the system means finding all of its relations and updating them too.
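To make the two options concrete, here is a rough sketch of the search request bodies each one would need, written as plain Python dicts. The field names "source_id" and "source_person_age" come from the lists above; the "age" field on the person index and the function names are only assumptions for illustration.

```python
def persons_older_than(years):
    """Option 1, step 1: find matching persons (assumes an 'age' field on persons)."""
    # _source is disabled because only the ids of the hits are needed.
    return {"query": {"range": {"age": {"gt": years}}}, "_source": False}


def relations_for_sources(person_ids):
    """Option 1, step 2: look up relations by the ids collected in step 1."""
    # This id list is what can grow very large.
    return {"query": {"terms": {"source_id": list(person_ids)}}}


def relations_denormalized(years):
    """Option 2: a single query, because the age was copied onto the relation."""
    return {"query": {"range": {"source_person_age": {"gt": years}}}}


if __name__ == "__main__":
    print(persons_older_than(40))
    print(relations_for_sources(["p-1", "p-2"]))  # hypothetical ids
    print(relations_denormalized(40))
```

Option 1 needs two round trips and a terms list as long as the number of matching persons; option 2 needs one query but a larger relation document.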
Is there any other way? If not, which of the above options is better?
I'd be happy to hear some opinions before I start developing.
Elasticsearch tends to not like queries on large numbers of ids, so I would usually recommend going with denormalization.
Note that there are some ways to make this easier to manage. For instance, it is easier to track the birth date of the account's owner than his/her age, since the birth date never changes. Also, you do not have to denormalize all properties of the owner, only those that you need for querying.
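As a minimal sketch of that idea, assuming a bank account document that carries only the owners' ids and birth dates (the field names, ids, and dates here are made up for illustration), the "older than 40" question becomes a range query using date math, and the stored data never needs refreshing as people age:

```python
# Hypothetical denormalized bank account document: only the owner
# properties needed for querying are copied onto it.
account_doc = {
    "account_id": "acc-1",
    "owners": [
        {"person_id": "p-1", "birth_date": "1970-05-17"},
        {"person_id": "p-2", "birth_date": "1992-11-02"},
    ],
}


def accounts_with_owner_aged_at_least(years):
    """Accounts with at least one owner born on or before 'now minus N years'."""
    # "now-40y" is Elasticsearch date math, evaluated at query time.
    return {"query": {"range": {"owners.birth_date": {"lte": f"now-{years}y"}}}}


if __name__ == "__main__":
    print(accounts_with_owner_aged_at_least(40))
```

This assumes "owners" is mapped as a plain object field; combining several conditions on the same owner would need a nested mapping instead.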
I didn't mention parent/child, because a child can have only one parent. In my case, a money transaction has two parents (2 bank accounts), and a bank account can also have multiple owners.
Will the second approach perform well at large scale? Say I need to make even 1,000,000 updates when an entity changes: how fast will ES do it?
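For concreteness, the kind of fan-out update I have in mind would look roughly like the request body below, assuming an Elasticsearch version that offers the update-by-query API with Painless scripts. The function name is hypothetical and the field names are the ones from my second option; this only illustrates the shape of the update and says nothing about how fast it runs.

```python
def rename_person_on_relations(person_id, new_name):
    """Body for an update-by-query that rewrites the copied name on every relation."""
    return {
        # Select every relation document that holds a copy of this person.
        "query": {"term": {"source_person_id": person_id}},
        # Overwrite the denormalized name in place on each matching document.
        "script": {
            "lang": "painless",
            "source": "ctx._source.source_person_name = params.name",
            "params": {"name": new_name},
        },
    }


if __name__ == "__main__":
    print(rename_person_on_relations("p-1", "New Name"))  # hypothetical values
```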