I am working with a combination of Neo4J and ES in a recommendation engine. I have an algorithm in Neo4J that generates a score between a user and a product. I also want to add 3 similar-product objects (similar1, similar2, similar3), which can contain up to 30k items.
What is the best way of storing the score and the similar products in ES?
1 - Create an object within the product doc with the user_id as the key and the score as the value.
2 - Create a separate index for each user, with the score beside each product, and again separate indices for each similar set, where the doc _id matches that of the product.
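To make the two layouts concrete, here is a rough sketch of the document shapes as plain Python dicts. It does not talk to ES at all. The index names (`products`, `scores_<user_id>`), the `scores` and `items` field names, and the helper functions are made up for illustration and are not part of any existing mapping.

```python
# Hypothetical document shapes for the two options; all names are assumptions.

def option1_doc(product_id, user_scores, similar_sets):
    """Option 1: a single product doc holding every user's score plus the similar sets."""
    body = {"scores": dict(user_scores)}  # user_id -> score
    body.update({name: list(items) for name, items in similar_sets.items()})
    return {"_index": "products", "_id": product_id, "_source": body}


def option2_docs(product_id, user_scores, similar_sets):
    """Option 2: one small doc per user index and per similar-set index,
    all sharing the product's _id."""
    docs = []
    for user_id, score in user_scores.items():
        docs.append({"_index": f"scores_{user_id}", "_id": product_id,
                     "_source": {"score": score}})
    for name, items in similar_sets.items():
        docs.append({"_index": name, "_id": product_id,
                     "_source": {"items": list(items)}})
    return docs


scores = {"u1": 0.9, "u2": 0.4}
similar = {"similar1": ["p2", "p3"], "similar2": [], "similar3": ["p9"]}

single = option1_doc("p1", scores, similar)
many = option2_docs("p1", scores, similar)
```

With option 1 the `scores` object gains one key per user, while option 2 produces one doc per user index plus one per similar set for every product.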
My concern with the first is that once the application starts to scale and I reach a couple of million users, the documents could end up reaching the 2 GB Lucene size limit.
With the 2nd option and the similar product objects from option 1, how would I build a query that returns the similar product data from the multiple indices as child objects within the product doc, and scores the product doc by the user score index?