I know Elasticsearch doesn't do joins in the same way SQL does. I'm wondering what is the best approach in the following scenario:
I have a Products index which contains product data and prices. I have other indexes, such as Orders, ShoppingCarts, Wishlists, etc. which rely on the Product data. I can nest Product documents in these other indexes, but it doesn't seem logical to me because whenever Product data is changed, I need to go change every single nested Product document. The other option is to just store the Product key in these other indexes. Then I would need to make 2 calls to elastic (i.e. one to get Order Lines, then use the Product keys to go get Product documents, and then merge the data in my code)
I am wondering what is the best strategy for this?
That's a fairly open question and depends on your access patterns. If you need to find orders based on product descriptions e.g. orders containing "milk AND cornflakes" then maybe you want to denormalize and include the product descriptions on the orders. Otherwise you need to query the product index first to get all the product IDs for milk or cornflake products and then query on the order index using the ids e.g. "(id1 OR id7 OR id99 OR ...) AND (id3 OR id54 OR....)". The same applies for any other product attributes e.g product prices or categories.
There's also an argument for capturing product attributes on orders to preserve the information current at that point in time which may later change e.g. product price.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.