Most efficient way to model data in Elasticsearch

I have an example of modelling an e-commerce site. Say that the site has a few hundred shops and a few million products. The number of products per shop ranges from 1,000 to 100,000. I need to be able to aggregate on both the product fields and the shop fields. All the products share one schema, and all the shops share one schema.




  1. Is it more efficient to have a) 1 index/shop, b) same index and 1 type/shop or c) same index, same type and have a field to determine the shop of the product?

I read some related articles, and most of them favour the same index with 1 type/shop. But they also say that a single index holding a large number of docs might be even slower than multiple indices.

  2. I also need to perform JOINS and aggregations between the shops and the products. For example, I need to be able to retrieve all the products from the shops with a rating higher than 8/10 and also get the number of products per category. Is it preferable to use a) an application-side JOIN, b) parent-child relationships, c) the Siren plug-in, d) something else?
  1. Different indices.
  2. Why do you think you need joins?
  1. Thanks for that. Can you please elaborate a bit more on why you think 1 index is better/faster?
  2. Because I first need to get all the shops with a rating above 8/10 and then join that with the products from these shops.

ES doesn't support joins. You have to model your data differently, typically by de-normalizing it.
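
As a minimal sketch of what de-normalizing could look like for the example in the question: copy the shop fields you need into every product document, so the "products from shops rated above 8" query becomes a plain filter plus a terms aggregation. The field names (`shop.rating`, `category`, and so on) and the helper functions are hypothetical, chosen only for illustration, and the sketch assumes `category` is mapped as a keyword field. It only builds the request bodies as plain dicts; sending them to a cluster is left out.

```python
import json


def build_product_doc(product, shop):
    """Return a denormalized product document (hypothetical schema).

    The shop's fields are embedded in the product, so no join is
    needed at query time.
    """
    return {
        "name": product["name"],
        "category": product["category"],
        "price": product["price"],
        "shop": {
            "id": shop["id"],
            "name": shop["name"],
            "rating": shop["rating"],
        },
    }


def build_query(min_rating):
    """Build a search body: products whose shop rating is above
    min_rating, plus a count of matching products per category."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"shop.rating": {"gt": min_rating}}}
                ]
            }
        },
        "aggs": {
            # Assumes "category" is a keyword field.
            "products_per_category": {"terms": {"field": "category"}}
        },
    }


if __name__ == "__main__":
    shop = {"id": "shop-1", "name": "Example Shop", "rating": 9}
    product = {"name": "Mug", "category": "kitchen", "price": 5.0}
    print(json.dumps(build_product_doc(product, shop), indent=2))
    print(json.dumps(build_query(8), indent=2))
```

The trade-off with this layout is that when a shop's rating changes, every product document of that shop has to be updated.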