Help on pipeline aggregation


(Thomas Decaux) #1

Here is what my documents look like:

{"ip" : "1.1.1.1", "event" : "click"}

Here "ip" is the user's IP address and "event" is the event type (click | buy | home | basket ...).

For each event, I want to know the average number of occurrences per distinct IP. Using SparkSQL with ES, the following query does this:

SELECT AVG(total), event 
FROM (SELECT COUNT(*) AS total, client.ip, event.name AS event 
    FROM my
    GROUP BY event.name, client.ip HAVING total < 100) my2
GROUP BY event

That gives:

[1.5374449339207048,click]
[6.683713173985548,error]
[1.0,account.login.check]
[1.1481481481481481,account.login]
[1.2649006622516556,buy]
[1.25,settings.update]
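
To make the intent of the query concrete, here is a minimal Python sketch of the same two-step computation: count documents per (event, IP) pair, drop pairs with 100 or more hits, then average the remaining counts per event. The function name, the `max_total` parameter, and the sample documents are made up for illustration; the field names follow the sample document above.

```python
from collections import Counter, defaultdict


def average_count_per_event(docs, max_total=100):
    # Inner query: COUNT(*) grouped by (event, ip).
    totals = Counter((d["event"], d["ip"]) for d in docs)

    # HAVING total < max_total, then the outer GROUP BY event.
    counts_by_event = defaultdict(list)
    for (event, _ip), total in totals.items():
        if total < max_total:
            counts_by_event[event].append(total)

    # AVG(total) for each event.
    return {event: sum(c) / len(c) for event, c in counts_by_event.items()}


# Hypothetical sample data, not taken from the real index.
sample = [
    {"ip": "1.1.1.1", "event": "click"},
    {"ip": "1.1.1.1", "event": "click"},
    {"ip": "2.2.2.2", "event": "click"},
    {"ip": "2.2.2.2", "event": "buy"},
]

print(average_count_per_event(sample))
```

With this sample, "click" has counts 2 and 1 across two IPs, so its average is 1.5, and "buy" averages 1.0.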

In ES, I tried to use the new pipeline aggregations, but I couldn't find the right nesting of aggs to make it work ;-(
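
One nesting that might fit, as an untested sketch: a terms aggregation on the event, a terms sub-aggregation on the IP, and an avg_bucket pipeline aggregation placed next to the IP aggregation that averages its bucket counts through the `ips>_count` path. The snippet below only builds the request body as a Python dict. The aggregation names (`events`, `ips`, `avg_count_per_ip`), the helper name, and the `max_ips` default are assumptions, and the field names should be adjusted to the real mapping. It does not reproduce the `HAVING total < 100` filter, and because a terms aggregation only returns its top `size` buckets, the average covers at most `max_ips` IPs per event.

```python
import json


def build_avg_count_request(event_field="event", ip_field="ip", max_ips=1000):
    # Field names and max_ips are placeholders; adapt them to the index.
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "events": {
                "terms": {"field": event_field},
                "aggs": {
                    # One bucket per IP within each event; its doc count
                    # corresponds to COUNT(*) in the inner SQL query.
                    "ips": {"terms": {"field": ip_field, "size": max_ips}},
                    # Sibling pipeline aggregation: averages the doc counts
                    # of the "ips" buckets, like AVG(total) in the outer query.
                    "avg_count_per_ip": {
                        "avg_bucket": {"buckets_path": "ips>_count"}
                    },
                },
            }
        },
    }


print(json.dumps(build_avg_count_request(), indent=2))
```

The printed JSON could then be sent as the body of a search request against the index.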

