Modeling relations into elastic


(Lior) #1

I'm currently working on desiging my app's DB.
I have several entities in the system, such as:
*Persons
*Bank Accounts
*Money transactions

Each bank account has one or more owners.
Each money transaction is between 2 bank accounts.

My DB should answer simple questions such as
"All people who are living in Paris"
"All money transactions in the last week"

but also "relational" questions such as
"All bank accounts of persons that are older than 40 years"
"All money transactions in London"

I thought modeling each entity as an index, and also add new index: relations. Now i have 2 options:

  1. use ids -> "source_id", "target_id", "relation properties"
    In this option i will filter on the entity itself (person older than 40), and then i'll search on the ids. It can be a lot of data, and a lot of time.

  2. denormalize -ALL- the searchable data on the relation:
    "source_person_id", "source_person_name", "source_person_age"...
    This is very expensive option, and each update of an entity in the system will cause search all the relations and update it also

Is there any other way? If no, which of the above options are better?
I'll be happy to hear some opinions before starting developing

Thanks,

Lio


(Adrien Grand) #2

Elasticsearch tends to not like queries on large numbers of ids, so I would usually recommend going with denormalization.

Note that there are some ways that you can make it easier to manage. For instance, it is easier to track the birth date of the owner of the account than his/her age since the birth data never changes. Also, you do not have to denormalize all properties of the owner, only those that you need for querying.

One thing that you did not mention is parent/child. https://www.elastic.co/guide/en/elasticsearch/reference/5.1/mapping-parent-field.html Be warned that relational queries will be slow, but that should still be significantly faster than the first approach that you described in your message.


(Lior) #3

Hey Adrian,
thanks on the reply

I didn't mentioned the parent/child, because a child can have only one parent. In my case, money transaction has two parents - 2 bank accounts, and the bank account is also can have multiple owners.

Is the second approach will perform well on large scale? sayi need to make even 1,000,000 updates if an entity changed, how fast will ES make it?


(Adrien Grand) #4

1M updates would certainly take a lot of time. Say you manage to perform 1000 updates per second, that would be 17 minutes.


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.