Hi @jpountz! Thanks again for your reply. Sorry, I was in transit yesterday and could not respond.
Let me try to explain my scenario and what I'm after:
We are an adserver. For every ad hit that reaches our adservers, we create a document in ES that looks like this:
My current project is to switch our statistics page to ES. On our stats page we display information for each campaign_id so users can check how a campaign is doing; the most important metrics are impressions and uniques per campaign_id.
At the moment, we have a rather cumbersome system that counts impressions/uniques from nginx logs instead of doing a straight INSERT into MySQL the moment the ad hit happens, since we deal with a lot of traffic. Our current setup parses the logs into the database, where I simply use DISTINCT() on the ip_address field to count unique hits between the dates entered by the user. A typical query would be something like:
SELECT campaign_id, count(*) AS impressions, count(DISTINCT ip_address) AS uniques FROM table WHERE timestamp BETWEEN X AND Y GROUP BY campaign_id
The plan now is to collect all this data in ES instead and run queries against it, so we can calculate the impressions as well as an approximate number of unique users that hit our ads between two timestamps. Uniques would be based on IP address, since that is probably the most accurate way we have to identify a single visitor.
I gave 2 examples in my first post because I will use the date histogram to generate an hourly chart, as well as a total unique count per campaign when the user needs it.
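To make the mapping from the SQL above more concrete, here is a rough sketch of the kind of request body I have in mind, written as a small Python helper that only builds the JSON (it does not talk to a cluster). The field names campaign_id, ip_address and timestamp are the ones mentioned above; the helper name `build_stats_query`, the aggregation names (`by_campaign`, `uniques`, `per_hour`) and the example dates are just placeholders I made up. It assumes a `terms` aggregation per campaign with a `cardinality` sub-aggregation for the approximate uniques, and each bucket's doc_count as the impressions. The name of the date histogram interval parameter depends on the ES version (`interval` on older releases, `fixed_interval` or `calendar_interval` on newer ones), so it is passed in rather than hard-coded.

```python
import json


def build_stats_query(start, end, hourly=False, interval_key="fixed_interval"):
    """Build a search body: impressions and approximate uniques per campaign.

    `interval_key` is an assumption about the cluster version: older
    releases expect "interval", newer ones "fixed_interval".
    """
    # Approximate distinct count of IPs, the ES counterpart of
    # COUNT(DISTINCT ip_address) in the SQL query.
    uniques = {"cardinality": {"field": "ip_address"}}

    # Total uniques per campaign; the bucket's doc_count gives impressions.
    per_campaign_aggs = {"uniques": uniques}

    if hourly:
        # Hourly buckets for the chart, each with its own unique count.
        per_campaign_aggs["per_hour"] = {
            "date_histogram": {"field": "timestamp", interval_key: "1h"},
            "aggs": {"uniques": uniques},
        }

    return {
        "size": 0,  # only aggregations are needed, not the hits themselves
        "query": {"range": {"timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "by_campaign": {
                "terms": {"field": "campaign_id"},
                "aggs": per_campaign_aggs,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical date range, for illustration only.
    body = build_stats_query(
        "2014-06-01T00:00:00Z", "2014-06-02T00:00:00Z", hourly=True
    )
    print(json.dumps(body, indent=2))
```

Note that the `terms` aggregation only returns the top buckets by default, so a `size` would probably need to be set on it to see every campaign, and the uniques summed across hourly buckets will not equal the total per campaign, since the same IP can show up in several hours.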
I hope I have explained it properly!
Thanks so much again!