Relationship best practices

Hi!

I want to ask about best practices for designing index data for related structures.

For example (customers and related orders):

customer: {
    id: "...",
    name: "...",
    ...
}

order: {
    id: "",
    customer_id: "...",
    product_name: "...",
    product_description: "...",
    ...
}

I want to find customers by order details.

How should I store this structure?

I know of two ways: a parent-child relationship or nested documents.
But I think both are too redundant and expensive. Order details can change, and customer details can change too (not at the same time).

Can I retrieve customers in two steps?
GET /orders (retrieve customer_ids)
GET /customers (by retrieved ids and other customer data)

Can this be done with the multi search API?

That would let me store customers and orders in separate indexes.

PS. Apache Solr has a simple Join mechanism for the same purpose:
https://wiki.apache.org/solr/Join

I also read:

That will depend on your access patterns and what you see as cost. For some people, added latency (because they need to resolve parent-child or run multiple queries) is expensive; for others, disk and memory are expensive. It's really up to your use case, your queries, and what you see as your main cost.

The decision between parent-child and an application-side join comes down to how you will most likely access your data. If the majority of your queries need to access the entities together, parent-child makes sense. If you frequently want to access the entities independently, the application-side join is probably better.
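For the parent-child route, here is a minimal sketch of what the mapping and the query might look like. It assumes an Elasticsearch version that has the `join` field type (older versions used a `_parent` mapping instead). The field name `relation` and the helper name are hypothetical, and the bodies are plain Python dicts that you would send as JSON.

```python
import json

# Hypothetical mapping: a single index holds customers (parents) and orders (children).
# Assumes the `join` field type; "relation" is an illustrative field name.
MAPPING = {
    "mappings": {
        "properties": {
            "relation": {"type": "join", "relations": {"customer": "order"}},
            "name": {"type": "text"},
            "product_name": {"type": "text"},
        }
    }
}


def customers_by_order_query(product_text):
    """Build a search body that returns customers having at least one matching order."""
    return {
        "query": {
            "has_child": {
                "type": "order",
                "query": {"match": {"product_name": product_text}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(customers_by_order_query("laptop"), indent=2))
```

Note that with this layout each order has to be indexed with routing set to its customer, so that parent and child land on the same shard.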

While you can use multi-search, you'd need to know every query when you send the initial request, so the second query cannot depend on the results of the first. I assume what you want is a search on orders; your application then extracts all customer_ids from the hits and can finally do a multi-get on all of those IDs.
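The two-step flow above could be sketched roughly like this. The sketch only builds the request bodies and parses a response, so it runs without a cluster. It assumes separate `orders` and `customers` indexes, that each customer document's `_id` equals its `id` field, and that orders are matched on `product_name`; the function names and the sample response are made up for illustration.

```python
def order_search_body(product_text, size=100):
    """Body for POST /orders/_search: match orders, return only customer_id."""
    return {
        "size": size,
        "_source": ["customer_id"],
        "query": {"match": {"product_name": product_text}},
    }


def extract_customer_ids(search_response):
    """Collect distinct customer_ids from the hits, keeping first-seen order."""
    seen = {}
    for hit in search_response["hits"]["hits"]:
        cid = hit["_source"].get("customer_id")
        if cid is not None:
            seen.setdefault(cid, None)
    return list(seen)


def customer_mget_body(customer_ids):
    """Body for POST /customers/_mget: fetch customers by document ID."""
    return {"ids": list(customer_ids)}


# Hand-written sample shaped like a search response (not real data).
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"customer_id": "c1"}},
            {"_source": {"customer_id": "c2"}},
            {"_source": {"customer_id": "c1"}},
        ]
    }
}

ids = extract_customer_ids(sample_response)
mget_body = customer_mget_body(ids)
```

If the second step also has to filter on customer fields, a regular search on `customers` with an `ids` query inside a `bool` filter would probably fit better than a multi-get.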
