I recently read an article about using Elasticsearch as a recommendation system. Now I am trying to implement my own simple "frequently bought together" recommender based on Elasticsearch's aggregations, with an articleId as input.
A document in my dataset contains the following fields (I parsed a CSV file via Logstash):
articleId
orderId
orderId / articleId
count
revenue
So, unlike in the article I mentioned, my dataset is "flat". Each document in Elasticsearch represents the purchase of a single article, not the whole "receipt". The connection to the "receipt" is made through the orderId.
I tried to use nested aggregations:
terms on orderId -> terms on articleId -> filter on a specific id ...
But now I am kind of stuck. Do you think my dataset suits this task? Do you have any ideas that could help me?
What I want in the end is a list of article numbers that are frequently ordered together with a certain article. So far I have tried to identify the orders that contain the input article (12345):
Identifying the orders containing the article can only be part of the solution, and even with this query I still get all the orders that do not contain the article. So I am not sure whether I am on the right track.
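For what it is worth, a likely reason for still seeing every order is that aggregations run over whatever the query part matches, so a filter placed inside a terms aggregation does not remove the parent buckets. A minimal sketch of a two-step approach on the flat data follows. It only builds the request bodies as Python dicts; the helper names and the size values are made up for illustration, and it assumes that articleId and orderId are mapped as keyword (not analyzed) fields.

```python
import json


def orders_containing_query(article_id, max_orders=1000):
    """Step 1: restrict the documents with a term query, then bucket by orderId."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"term": {"articleId": article_id}},
        "aggs": {"orders": {"terms": {"field": "orderId", "size": max_orders}}},
    }


def co_purchased_query(order_ids, article_id, size=10):
    """Step 2: count the other articles that appear in those orders."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [{"terms": {"orderId": order_ids}}],
                # leave out the input article itself
                "must_not": [{"term": {"articleId": article_id}}],
            }
        },
        "aggs": {"co_purchased": {"terms": {"field": "articleId", "size": size}}},
    }


step1 = orders_containing_query("12345")
# In practice order_ids would be the bucket keys returned for step 1.
step2 = co_purchased_query(["o1", "o2"], "12345")
print(json.dumps(step1, indent=2))
print(json.dumps(step2, indent=2))
```

This ranks by raw co-occurrence counts, so articles that are popular in general will tend to dominate the list.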
You need to create order docs where the purchased item IDs are held in an array on the order doc. Then query for an item ID and use the 'significant_terms' aggregation on the order's itemIDs field to find other purchases that are strongly related.
See also: Graph explore API for webshop case
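A rough sketch of that suggestion, under some assumptions: the flat rows are available as dicts with orderId and articleId, the array field on the order doc is called articleIds (a name chosen here for illustration), and that field is mapped as keyword. The helper functions are hypothetical and only build the documents and the request body; indexing and sending the search are left out.

```python
import json
from collections import defaultdict


def build_order_docs(rows):
    """Group flat purchase rows into one document per order."""
    articles_by_order = defaultdict(set)
    for row in rows:
        articles_by_order[row["orderId"]].add(row["articleId"])
    return [
        {"orderId": order_id, "articleIds": sorted(article_ids)}
        for order_id, article_ids in sorted(articles_by_order.items())
    ]


def related_articles_query(article_id, size=10):
    """Search body: match orders containing the article, then ask for
    significant terms on the same array field."""
    return {
        "size": 0,  # only the aggregation result is of interest
        "query": {"term": {"articleIds": article_id}},
        "aggs": {
            "related_articles": {
                "significant_terms": {"field": "articleIds", "size": size}
            }
        },
    }


# Made-up sample rows in the flat shape described in the question.
rows = [
    {"orderId": "o1", "articleId": "12345"},
    {"orderId": "o1", "articleId": "67890"},
    {"orderId": "o2", "articleId": "12345"},
    {"orderId": "o2", "articleId": "67890"},
    {"orderId": "o2", "articleId": "11111"},
    {"orderId": "o3", "articleId": "22222"},
]

order_docs = build_order_docs(rows)
print(json.dumps(order_docs, indent=2))
print(json.dumps(related_articles_query("12345"), indent=2))
```

Unlike a plain terms aggregation, significant_terms compares how often an article appears in the matching orders with how often it appears in the index overall, so generally popular articles should not dominate. The input article itself may show up among the buckets and can be dropped on the client side.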