Aggregate query count having count

Petr.Simik · November 6, 2019, 10:06am

I need to create a visualisation with 4 lines
(I want to identify how many customers are using how many devices over time):
count(unique customer) having count(unique deviceid) = 2
count(unique customer) having count(unique deviceid) = 3
count(unique customer) having count(unique deviceid) = 4
count(unique customer) having count(unique deviceid) > 4

In the index I have 3 fields:
@timestamp - date
customer - keyword
deviceid - keyword

This is a log from an application; one customer can have several documents with the same deviceid.

How can I achieve this?
Thank you

Petr.Simik · November 7, 2019, 11:44am

This is what I am looking for.

Is there a way to achieve this in Elasticsearch?

count(unique customer)
group by count(unique deviceid)

Mark_Harwood · November 12, 2019, 3:26pm

This sort of behavioural analysis will be hard to do using a distributed index on a lot of data. It requires a lot of joins based on a customer key, and distributed joins are expensive in any system.

You'll likely need to build an "entity-centric" index (each doc = one customer).
The new transform API can help collapse the device IDs for each customer doc using a cardinality aggregation. Once you've built the customer index, you can use a date histogram in Kibana on it and plot the lines using custom ranges for the 4 device ownership ranges you listed.
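
A rough sketch of what that could look like, written as Python dicts that mirror the two request bodies. The index names `app-logs` and `customer-devices` and the output field `device_count` are made up for illustration, and the transform endpoint path has changed between Elasticsearch versions, so check the docs for the version you run. Also keep in mind that the cardinality aggregation is approximate, although with a handful of devices per customer that should rarely matter.

```python
import json

# Hypothetical index names; only the field names come from the question.
SOURCE_INDEX = "app-logs"
DEST_INDEX = "customer-devices"


def build_transform_body(source_index, dest_index):
    """Pivot the raw log into one document per customer with a distinct-device count."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            # One output document per distinct customer value.
            "group_by": {"customer": {"terms": {"field": "customer"}}},
            # Approximate number of distinct device IDs seen for that customer.
            "aggregations": {
                "device_count": {"cardinality": {"field": "deviceid"}}
            },
        },
    }


def build_range_query():
    """Bucket the customer documents into the four ownership ranges from the question."""
    return {
        "size": 0,
        "aggs": {
            "device_ownership": {
                "range": {
                    "field": "device_count",
                    # In a range aggregation "from" is inclusive and "to" is exclusive.
                    "ranges": [
                        {"key": "2 devices", "from": 2, "to": 3},
                        {"key": "3 devices", "from": 3, "to": 4},
                        {"key": "4 devices", "from": 4, "to": 5},
                        {"key": "more than 4 devices", "from": 5},
                    ],
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_transform_body(SOURCE_INDEX, DEST_INDEX), indent=2))
    print(json.dumps(build_range_query(), indent=2))
```

The first body is what you would send when creating the transform; the second is a search against the resulting customer index. In Kibana the same four ranges can be entered as custom ranges on the `device_count` field.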

However, the challenge here is the date info: presumably you want a customer to be able to appear once in the line for February with 2 devices but again in March with 3 devices? This would mean the docs would not be customer documents but customer-as-at documents, which summarise the person's device count on that particular date. That will likely require a custom script to build that sort of index.
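
To make the customer-as-at idea concrete, here is a minimal in-memory sketch of the logic such a script would need. It assumes monthly buckets (any interval would do) and plain dicts carrying the three fields from the question; the function names and sample data are invented for illustration, and a real script would page through the index rather than hold everything in memory.

```python
from collections import Counter, defaultdict
from datetime import datetime


def month_of(timestamp):
    """Truncate an ISO-8601 timestamp (no trailing 'Z') to its month, e.g. '2019-02'."""
    return datetime.fromisoformat(timestamp).strftime("%Y-%m")


def bucket_label(device_count):
    """Map a distinct-device count to one of the four lines, or None if below 2."""
    if device_count > 4:
        return ">4"
    if device_count >= 2:
        return str(device_count)
    return None


def customers_per_device_bucket(log_docs):
    """Return {month: {bucket: number of customers}} from raw log documents."""
    # Step 1: one "customer-as-at" entry per (month, customer) holding the
    # set of distinct device IDs; repeated log lines for a device collapse.
    devices = defaultdict(set)
    for doc in log_docs:
        key = (month_of(doc["@timestamp"]), doc["customer"])
        devices[key].add(doc["deviceid"])

    # Step 2: count customers per device-count bucket within each month.
    counts = defaultdict(Counter)
    for (month, _customer), device_ids in devices.items():
        label = bucket_label(len(device_ids))
        if label is not None:
            counts[month][label] += 1
    return {month: dict(counter) for month, counter in counts.items()}


# Invented sample: alice uses 2 devices in February and 3 in March; bob uses 1.
SAMPLE_DOCS = [
    {"@timestamp": "2019-02-03T10:00:00", "customer": "alice", "deviceid": "d1"},
    {"@timestamp": "2019-02-04T11:00:00", "customer": "alice", "deviceid": "d1"},
    {"@timestamp": "2019-02-09T09:30:00", "customer": "alice", "deviceid": "d2"},
    {"@timestamp": "2019-02-10T08:00:00", "customer": "bob", "deviceid": "d9"},
    {"@timestamp": "2019-03-01T12:00:00", "customer": "alice", "deviceid": "d1"},
    {"@timestamp": "2019-03-02T12:00:00", "customer": "alice", "deviceid": "d2"},
    {"@timestamp": "2019-03-05T12:00:00", "customer": "alice", "deviceid": "d3"},
]

if __name__ == "__main__":
    print(customers_per_device_bucket(SAMPLE_DOCS))
```

Step 1 is the part that produces the customer-as-at documents; step 2 is what the date histogram plus ranges would then do on top of them.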

Petr.Simik · November 12, 2019, 3:43pm

Great, I did not know about the transform API. I am looking at the documentation;
this seems to be useful in my case.
I will give it a try. I hope I will be able to write code to transform the data for further analysis.
