Significant Terms Recommendations

(Bruno Zeraik) #1

I'm new to Elasticsearch and I'm developing a product recommendation feature like 'People who bought this also bought...'

I have an index named "products" which I use for faceted filtering (price, category, brand and so on). Each document in that index has all the information needed to show a full page of products, so with just one query in Elasticsearch I can build the page.
For the recommendation system, I thought about using significant terms on my completed orders, so I can suggest products that were bought together with a product X that was added to the cart.

So the index would be something like this:

{"products": ["sku1", "sku2", "sku3", "sku4"]}
{"products": ["sku1", "sku2", "sku3"]}
{"products": ["sku1", "sku3", "sku3", "sku5"]}

When I query using significant terms in Elasticsearch, the result will be something like this:

{key: "sku1", score: 0.8},
{key: "sku2", score: 0.5},

With that I have the identifiers (SKUs) of the products, but I don't have the rest of the information. Is it right to do two queries to build my recommendation shelf?
And what if I wanted to filter the recommended products by price (e.g. products bought together but with a price lower than $80)? Would I need to query products by SKU and price?

(Mark Harwood) #2

If I understand your example correctly, you're using previous orders to make the associations between products. So you will be querying the orders index and looking for significant terms in the products field.
You say that you also want to then filter the product suggestions by price. You are right that this would need to be handled in your app, using a subsequent query on a different index like products where, given the SKU, you can get the price (and colour, size, etc.).
You should be able to set the size parameter on the significant_terms agg to something large e.g. 10000 skus to give you a large base to then filter. Looking up many IDs in a search may prove to be expensive.
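A minimal sketch of that first step, written as plain Python dicts so it does not depend on any particular client version. It assumes the orders documents look like the ones posted above, that `products` is mapped as a keyword field, and that the aggregation name `also_bought` and both helper functions are made up for illustration. The seed SKU is dropped on the client side because it may well come back as a significant term itself.

```python
def build_also_bought_query(seed_sku, size=100):
    """Search body for the orders index.

    Orders containing seed_sku form the foreground set; significant_terms
    then reports SKUs that are unusually frequent in that set.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the order hits
        "query": {"term": {"products": seed_sku}},
        "aggs": {
            "also_bought": {  # hypothetical aggregation name
                "significant_terms": {"field": "products", "size": size}
            }
        },
    }


def extract_skus(response, seed_sku):
    """Return (sku, score) pairs from the response, skipping the seed SKU."""
    buckets = response["aggregations"]["also_bought"]["buckets"]
    return [(b["key"], b["score"]) for b in buckets if b["key"] != seed_sku]
```

The body returned by `build_also_bought_query` would be sent as a search request against the orders index, and the resulting SKU list is what gets looked up in the products index afterwards.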

You could try filtering the significant_terms agg with a query for orders whose products do not exceed the required price range, but that might be over-aggressive, as whole orders may be filtered out if they contain expensive products that have nothing to do with your search term or the suggested "also-boughts".
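For illustration only, the order-level filter could look roughly like this. It assumes each order document also carries a numeric `prices` field with one value per line item; that field name is hypothetical and does not appear in the example documents above. Because a range query on a multi-valued field matches when any value is in range, the `must_not` clause discards every order that contains at least one item above the limit, which is exactly the over-aggressive behaviour described.

```python
def build_price_capped_query(seed_sku, max_price, size=100):
    """Orders-index search body that ignores orders with any expensive item.

    Assumes a hypothetical numeric `prices` field holding the price of
    every line item in the order.
    """
    return {
        "size": 0,
        "query": {
            "bool": {
                # keep only orders that contain the seed product
                "filter": [{"term": {"products": seed_sku}}],
                # drop the whole order if ANY item costs more than max_price
                "must_not": [{"range": {"prices": {"gt": max_price}}}],
            }
        },
        "aggs": {
            "also_bought": {  # hypothetical aggregation name
                "significant_terms": {"field": "products", "size": size}
            }
        },
    }
```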

(Bruno Zeraik) #3

Hi Mark, thanks for your reply
Yes, you are right, I'm using previous orders to build the recommendation system.

Any idea of a recommendable size that would not be expensive if I decide to query products index by skus?

What would the orders index need to look like to query for products within a price range, since this information is only available in the products index?

(Mark Harwood) #4

I presumed orders might contain prices. Ignore me if not.

"It depends" - on your data and hardware. Benchmarking is the way forward.

(Bruno Zeraik) #5

Yes, it contains prices, but the problem is that product prices change frequently, so it wouldn't be easy to keep them synchronized. And I might also want to filter by other properties (category, brand, ...).

OK, let's say I don't want to filter the products returned by significant terms. I would still need a subsequent query against the products index, querying just by the SKUs, to get the product information. Is this the right approach?

(Mark Harwood) #6

You'd want to query the products index. Use a bool query with two must or filter clauses: one is a terms query listing the SKUs and the other is a range query with the price restriction.
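A sketch of that second query, again as a plain Python dict. The field names `sku` and `price` are assumptions about the products mapping, and both helper functions are hypothetical. The two clauses go in `filter` rather than `must` here because no relevance score is needed for a lookup by ID and price.

```python
def build_products_query(skus, max_price):
    """Products-index search body: listed SKUs priced below max_price."""
    return {
        "size": len(skus),  # at most one product document per SKU
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"sku": skus}},               # assumed field name
                    {"range": {"price": {"lt": max_price}}},  # assumed field name
                ]
            }
        },
    }


def order_like(skus, hits):
    """Re-sort product hits to follow the order of the recommended SKUs."""
    rank = {sku: i for i, sku in enumerate(skus)}
    return sorted(hits, key=lambda h: rank[h["_source"]["sku"]])
```

The filter does not preserve the order of the `skus` list, so if that list is sorted by significance score, something like `order_like` is needed to restore the ranking before rendering the shelf.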

(Bruno Zeraik) #7

Thanks Mark

I'll try it
