How to search association rules for best match using ES


(Zoran Jeremic) #1

Hi,

I'm developing a Java application that should recommend to users the activities most often used to reach a specific competence. My approach is to use association rules. I have already generated the rules and stored them in ES in the following form:

    {
        "id": 24,
        "support": 1,
        "confidence": 0.671,
        "itemset1": [{"id": 3}, {"id": 8}, {"id": 12}],
        "itemset2": [{"id": 4}, {"id": 7}, {"id": 8}]
    }

As input, the system receives a competence id (e.g. 24) and a list of activities the user has already passed (e.g. 3, 8, 12, 17). This list should be compared with itemset1 to find the best match. It doesn't have to be an exact match, just the closest one. I would then use itemset2 as the recommendation. Confidence is also relevant, and it would be great if this value could affect the score of the results.
Ideally, the activities the user has already passed should not be present in itemset2.
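
For illustration, the ranking described above can be sketched on the client side in plain Java. This is only one possible reading of "closest match": it assumes Jaccard overlap between the passed activities and itemset1, multiplied by confidence, which is not something stated in the post. The class and method names (`RuleRanking`, `score`, `recommend`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: ranks a rule against the user's passed activities. */
final class RuleRanking {

    /**
     * Assumed scoring: Jaccard overlap of passed activities and itemset1,
     * weighted by the rule's confidence. Returns 0.0 when both sets are empty.
     */
    static double score(Set<Integer> passed, Set<Integer> itemset1, double confidence) {
        Set<Integer> intersection = new LinkedHashSet<>(passed);
        intersection.retainAll(itemset1);

        Set<Integer> union = new LinkedHashSet<>(passed);
        union.addAll(itemset1);

        if (union.isEmpty()) {
            return 0.0;
        }
        return confidence * intersection.size() / union.size();
    }

    /** Returns itemset2 without the activities the user has already passed. */
    static List<Integer> recommend(Set<Integer> passed, List<Integer> itemset2) {
        List<Integer> result = new ArrayList<>(itemset2);
        result.removeAll(passed);
        return result;
    }
}
```

With the example rule and the input list 3, 8, 12, 17, the overlap is 3 of 4 distinct activities, so the score is about 0.503, and the recommendation drops activity 8 from itemset2, leaving 4 and 7.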

My problem is how to create a query that will find the best match. I started with the simplest scenario, which filters rules only by comparing itemset1 with the input list. I created a term filter for each input activity and combined them with AndFilterBuilder. However, this works only if the input list is a subset of itemset1. For example, the list (3, 8, 12, 17) will not return the association rule above, because 17 is not in itemset1.
I also tried a bool filter with a should clause for each term, but that doesn't work either.

I hope somebody can give me an idea of how to solve this.

Thanks,
Zoran


(Adrien Grand) #2

Can you elaborate on why this did not work correctly with a bool query with should clauses?
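
For readers following along, a bool query with should clauses for this case might look like the sketch below. It builds the JSON request body as a plain string, so it makes no claim about any particular Java client API. The field paths `id` and `itemset1.id` assume the documents are indexed with default object mapping (not `nested`), and the class name `RuleQueryBuilder` is hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: builds the JSON body of a best-match bool query. */
final class RuleQueryBuilder {

    static String bestMatchQuery(int competenceId, List<Integer> passedActivities) {
        // One term clause per passed activity. The rule does not need to match
        // all of them; rules matching more clauses generally score higher.
        String shouldClauses = passedActivities.stream()
                .map(a -> "{\"term\":{\"itemset1.id\":" + a + "}}")
                .collect(Collectors.joining(","));

        return "{\"query\":{\"bool\":{"
                // The rule must belong to the requested competence.
                + "\"must\":[{\"term\":{\"id\":" + competenceId + "}}],"
                + "\"should\":[" + shouldClauses + "],"
                // Require at least one overlapping activity.
                + "\"minimum_should_match\":1"
                + "}}}";
    }
}
```

Unlike an AND of term filters, this does not require every input activity to appear in itemset1, so the input 3, 8, 12, 17 can still match a rule whose itemset1 is 3, 8, 12.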


(Zoran Jeremic) #3

Sorry, it does work with a bool query and should clauses. For some reason I missed that at the beginning and focused all my attempts on filters. I got this working the way I wanted, but I ran into another problem.
In my second scenario, I have to find those rules where "itemset1" contains exactly one element, and that element matches the given input.
For example:
for input: id=24 and item=3, I should get
{"id":24, "itemset1":[{"id":3}]...} but not {"id":24, "itemset1":[{"id":3},{"id":4}]...}

Could you please suggest which type of query/filter should be used for that? I guess this can be done if I store one extra field, list_size, and then search only for documents where list_size=1, but I was hoping to do it purely from the query.
Thanks,
Zoran


(Adrien Grand) #4

In order to do that, you could store the size of itemset1 as a property of your documents and then add a filter to make sure that you only find documents whose item set contains exactly one element?
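
A minimal sketch of that suggestion, again building the JSON body as a plain string. It assumes each rule is indexed with an extra numeric field holding the number of elements in itemset1; the field name `list_size` is the one floated earlier in the thread, and the class name `SingleItemRuleQuery` is hypothetical.

```java
/** Hypothetical helper: matches rules whose itemset1 is exactly one given activity. */
final class SingleItemRuleQuery {

    static String exactSingleItemQuery(int competenceId, int activityId) {
        return "{\"query\":{\"bool\":{\"must\":["
                // Rule belongs to the requested competence.
                + "{\"term\":{\"id\":" + competenceId + "}},"
                // itemset1 contains the given activity...
                + "{\"term\":{\"itemset1.id\":" + activityId + "}},"
                // ...and contains nothing else (size stored at index time).
                + "{\"term\":{\"list_size\":1}}"
                + "]}}}";
    }
}
```

The size has to be written by the application when the rule is indexed, for example `"list_size": 3` for the sample rule shown at the top of the thread.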

