"Often bought together" using aggregations?


(Christof Nolle) #1

Hi everybody,

thank you for reading this post :)

I recently read an article about using Elasticsearch as a recommendation system. Now I'm trying to implement my own simple "frequently bought together" recommender based on Elasticsearch's aggregations, with an articleId as input.

A document in my dataset contains the following fields (I parsed a CSV file via logstash):

  • articleId
  • orderId
  • orderId/articleId
  • count
  • revenue

So, unlike in the article mentioned above, my dataset is "flat". Each document in Elasticsearch represents the purchase of a single article, not the whole "receipt". The connection to the "receipt" is made via the orderId.

I tried to use nested aggregations:

terms on orderId -> terms on articleId -> filter on a specific id ...

But now I'm kind of stuck. Do you think my dataset suits this task? Do you have any ideas that could help me?

Thank you very much!


(David Pilato) #2

Can you share some concrete examples, what you tried so far and the kind of result you are expecting?

Ideally provide a full recreation script as described in

It will help to better understand what you are doing.
Please, try to keep the example as simple as possible.


(Christof Nolle) #3

Hi David,
thank you for your quick reply. A purchase document in my ES index looks like this:

{
    "orderId/articleId": "123/456",
    "date": "2016-10-31T23:00:00.000Z",
    "count": 1,
    "orderId": "123",
    "articleId": "456",
    "revenue": 5,
    "currency": "EUR"
}

What I want in the end is a list of article numbers that are frequently ordered together with a certain article. So far I have tried to identify the orders that contain the input article (12345):

{
    "aggs": {
        "baskets": {
            "aggs": {
                "articles": {
                    "aggs": {
                        "with_article": {
                            "filter": {
                                "term": {
                                    "articleId.keyword": "12345"
                                }
                            }
                        }
                    },
                    "terms": {
                        "field": "articleId.keyword"
                    }
                }
            },
            "terms": {
                "field": "orderId.keyword"
            }
        }
    }
}

Identifying the orders containing the article can only be part of the solution, and even with this query I still get all the orders that do not contain the article. So I'm not sure whether I am on the right track.
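
For readers following along, here is a minimal Python sketch of one way to list only the orders that contain the input article. A `filter` sub-aggregation only counts matching documents inside each bucket and does not remove buckets, which would explain why every order still shows up. The sketch moves the `term` clause into the top-level `query` instead. The function name `orders_containing` and the `max_orders` parameter are hypothetical; the field names are taken from the query above, and the resulting dict would be sent as the body of a `_search` request.

```python
import json


def orders_containing(article_id, max_orders=1000):
    """Build a _search body whose buckets are only the orders containing article_id.

    Assumes the flat purchase index described above, with keyword
    sub-fields `articleId.keyword` and `orderId.keyword`.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        # Restrict the document set first, so that only purchase lines
        # of the input article reach the aggregation.
        "query": {"term": {"articleId.keyword": article_id}},
        "aggs": {
            "baskets": {
                "terms": {"field": "orderId.keyword", "size": max_orders}
            }
        },
    }


body = orders_containing("12345")
print(json.dumps(body, indent=2))
```

This only answers the first half of the problem (which orders contain the article); finding the other articles in those orders would still need a second step on the flat data.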


(Mark Harwood) #4

You need to create order docs where purchased item IDs are held in an array on the order doc. Query for an item ID and use the 'significant_terms' aggregation on the order itemIDs field to find strongly related other purchases.
See Graph explore api for webshop case
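
The suggestion above could be sketched roughly as follows in Python: group the flat purchase documents into one document per order, then query for an article ID and run `significant_terms` on the array field. The helper names `build_order_docs` and `related_articles_query`, and the field name `articleIds`, are assumptions made for illustration (the field is assumed to be mapped as `keyword`); they are not taken from the thread.

```python
from collections import defaultdict


def build_order_docs(purchases):
    """Collapse flat purchase lines into one doc per order with an array of article IDs."""
    grouped = defaultdict(set)
    for purchase in purchases:
        grouped[purchase["orderId"]].add(purchase["articleId"])
    return [
        {"orderId": order_id, "articleIds": sorted(article_ids)}
        for order_id, article_ids in sorted(grouped.items())
    ]


def related_articles_query(article_id, size=10):
    """Build a _search body for articles strongly associated with article_id."""
    return {
        "size": 0,
        # Foreground set: all orders that contain the input article.
        "query": {"term": {"articleIds": article_id}},
        "aggs": {
            "related": {
                "significant_terms": {
                    "field": "articleIds",
                    "size": size,
                    # The input article appears in every matching order,
                    # so leave it out of the suggestions.
                    "exclude": [article_id],
                }
            }
        },
    }


purchases = [
    {"orderId": "123", "articleId": "456"},
    {"orderId": "123", "articleId": "12345"},
    {"orderId": "124", "articleId": "12345"},
    {"orderId": "124", "articleId": "789"},
    {"orderId": "124", "articleId": "456"},
]
order_docs = build_order_docs(purchases)
query = related_articles_query("12345")
```

The order documents would be indexed into a separate index, and the query run against that index; `significant_terms` then ranks articles by how much more often they appear in orders with the input article than in orders overall.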


(Christof Nolle) #5

Thank you very much for your help! I got what I wanted using Logstash's aggregate filter.

