My documents look like this:
{"ip" : "1.1.1.1", "event" : "click"}
where "ip" is the user's IP address and "event" is the event type (click | buy | home | basket ...).
For each event, I want to know the average number of occurrences per distinct IP. Using Spark SQL with ES, I can get this with the following query:
SELECT AVG(total), event
FROM (
    SELECT COUNT(*) AS total, client.ip, event.name AS event
    FROM my
    GROUP BY event.name, client.ip
    HAVING total < 100
) my2
GROUP BY event
That gives:
[1.5374449339207048,click]
[6.683713173985548,error]
[1.0,account.login.check]
[1.1481481481481481,account.login]
[1.2649006622516556,buy]
[1.25,settings.update]
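To make the intended computation explicit, here is a minimal Python sketch of the same two-level grouping, run on a few in-memory documents. It assumes flat documents with "ip" and "event" keys as in the sample above; the function name, the `max_total` parameter and the sample data are hypothetical and only serve the illustration.

```python
from collections import Counter, defaultdict


def average_event_count_per_ip(docs, max_total=100):
    """Return {event: average number of occurrences per distinct IP}."""
    # Step 1: count documents per (event, ip) pair (the inner GROUP BY).
    per_pair = Counter((doc["event"], doc["ip"]) for doc in docs)

    # Step 2: keep only pairs below the threshold (HAVING total < 100)
    # and collect the per-IP counts under their event.
    counts_by_event = defaultdict(list)
    for (event, _ip), total in per_pair.items():
        if total < max_total:
            counts_by_event[event].append(total)

    # Step 3: average the per-IP counts of each event (the outer GROUP BY).
    return {
        event: sum(counts) / len(counts)
        for event, counts in counts_by_event.items()
    }


# Hypothetical sample data, not taken from the real index.
sample_docs = [
    {"ip": "1.1.1.1", "event": "click"},
    {"ip": "1.1.1.1", "event": "click"},
    {"ip": "2.2.2.2", "event": "click"},
    {"ip": "1.1.1.1", "event": "buy"},
]

result = average_event_count_per_ip(sample_docs)
print(result)
```

In this sample, "click" is seen twice from one IP and once from another, so its average is 1.5, while "buy" averages 1.0. This is the per-event figure I am trying to get out of ES.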
In ES, I tried to use the new pipeline aggregations, but I couldn't find the right nesting of aggs to make it work ;-(