I currently have a use case where I need to store lots of product information that needs to be searchable.
This data is updated many times a day. Each product could have any number of attributes associated with it, depending on the type of product and where it will be sold. Any field could be updated at any given time, but not necessarily all of them at once, so the workload would mostly be partial updates. We are talking about tens of millions of products.
When searching, the user would expect to only get the most recent value per field per product. We would also like the user to be able to look at a specific product and see what has changed over time.
My initial thinking was that a new document (a partial document) would be inserted for each change, and a composite of those changes would then be assembled at query time. For example, if two fields were updated at once, that would create a single document containing the product's unique ID along with the new values of those two fields. If another change is then made that updates two different fields, a second partial document is created. When the user queries for this product, they would see all four fields. Looking at the querying options in ES, it seems this would be cumbersome to do from the application side.
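To make the composite idea concrete, here is a rough sketch of the merge the application would have to do after fetching all the change documents for one product. It is plain Python, not an ES query, and the document shape (`product_id`, `updated_at`, `fields`) and the sample values are made up purely for illustration. It also assumes every timestamp is an ISO-8601 string in the same format, so that sorting the strings sorts them by time.

```python
from typing import Any


def compose_latest(changes: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge partial change documents into one view with the newest value per field."""
    latest: dict[str, Any] = {}
    # Apply changes oldest first so that later values overwrite earlier ones.
    for change in sorted(changes, key=lambda c: c["updated_at"]):
        latest.update(change["fields"])
    return latest


def field_history(changes: list[dict[str, Any]], field: str) -> list[tuple[str, Any]]:
    """Return (timestamp, value) pairs for one field, oldest first."""
    ordered = sorted(changes, key=lambda c: c["updated_at"])
    return [(c["updated_at"], c["fields"][field]) for c in ordered if field in c["fields"]]


# Hypothetical change documents for a single product (illustrative values only).
changes = [
    {"product_id": "p-1", "updated_at": "2024-01-01T09:00:00Z",
     "fields": {"price": 10.0, "color": "red"}},
    {"product_id": "p-1", "updated_at": "2024-01-01T12:00:00Z",
     "fields": {"weight_kg": 1.2, "stock": 5}},
    {"product_id": "p-1", "updated_at": "2024-01-02T08:00:00Z",
     "fields": {"price": 9.5}},
]

current_view = compose_latest(changes)
price_changes = field_history(changes, "price")
```

With the sample data, `current_view` holds all four fields with the latest price, and `price_changes` lists both price values in order. Doing this for a single product is easy enough; the part that looks cumbersome is doing it for every hit in a search across tens of millions of products.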
Is there a better way to model this data? Also please tell me if I am not explaining this properly.