I have an index with ~20000 products for an e-commerce website. I would like to implement a "previously purchased feature" for specific users.. so only specific products will return for specific users in this case... but also with a sort on their most frequently purchased item.
If there was no sorting, I think the best way to do this would be just to a store a keyword array of user_ids on each product.
So my products would look something like:
{
"name": "Product A",
.......
"bought_by_user_ids": [1,2,3,4,5,8,10]
}
And then my query would be something like
{
"query": {
"bool": {
"must": [
{ "match": { "name" : "Product" } }
],
"filter: [
{ "match": { "bought_by_user_ids": 1 } }
]
}
}
}
Where 1 is the current user's integer ID.
The kicker is I also want to sort on most commonly purchased by each user. So my thinking was instead of just keeping user_ids on the product, I can store an object field with user_ids as keys and the number of times they purchased the product as the values.
So the product would look like:
{
"name": "Product A",
.......
"bought_by_user_ids": {
"1": 5,
"2": 10
}
}
Meaning user with ID 1 has bought the product 5 times, and user with ID 2 has bought the product 10 times.
And the query would be an exists query.
{
"query": {
"exists": {
"field": "bought_by_user_ids.1"
}
},
"sort" : [{
"bought_by_user_ids.1" : {
"order" : "desc",
"unmapped_type" : "float"
}
}]
}
Where 1 is the current user's integer ID.
The solution works but I'm wondering if there's a better way to do this? I noticed I may eventually run into a total_fields limit because apparently every entry in the object is its own field? Any help is appreciated.
Thanks.