I've been stuck on a tricky requirement for quite a while. I have time series data capturing when a user used which feature under which identity, like this:
The trace_id identifies a unique visitor interacting with the web app, and is computed from data like IP and User-Agent.
With this data in ES, I am hoping to answer "how many people used free trial features before registering?". I've created a query that answers the question, but I couldn't figure out how to get this metric as a cumulative sum over some time period.
date_histogram + cumulative_sum doesn't seem to work because each bucket only covers its own fixed interval, whereas a visitor may have tried the application long before registering (possibly months earlier). So I guess what's really needed is a special "interval" setting with a fixed start date (e.g., the date the application went live) and a moving end date that grows daily/weekly/monthly?
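To make the metric I'm after concrete, here is a minimal Python sketch of it computed outside ES, over events already fetched into memory. It is not an ES query, only an illustration of the expected numbers. `trace_id` and `is_free_trial` come from my data; the `timestamp` and `is_registration` keys and the function name are hypothetical placeholders for however the event time and the registration event are actually represented.

```python
from collections import defaultdict
from datetime import date, datetime
from itertools import accumulate


def cumulative_trial_before_registration(events):
    """Return [(registration_day, running_total), ...] sorted by day.

    Each event is a dict with the keys:
      trace_id        - visitor identifier (from the data)
      is_free_trial   - True if the event used a free trial feature (from the data)
      timestamp       - datetime of the event (assumed field name)
      is_registration - True if the event is the registration (assumed field name)
    """
    first_trial = {}         # trace_id -> earliest free trial timestamp
    first_registration = {}  # trace_id -> earliest registration timestamp

    for event in events:
        tid, ts = event["trace_id"], event["timestamp"]
        if event["is_free_trial"]:
            if tid not in first_trial or ts < first_trial[tid]:
                first_trial[tid] = ts
        if event["is_registration"]:
            if tid not in first_registration or ts < first_registration[tid]:
                first_registration[tid] = ts

    # Attribute each qualifying visitor to the day they registered,
    # no matter how long ago their first trial happened.
    per_day = defaultdict(int)
    for tid, registered_at in first_registration.items():
        tried_at = first_trial.get(tid)
        if tried_at is not None and tried_at < registered_at:
            per_day[registered_at.date()] += 1

    days = sorted(per_day)
    running_totals = accumulate(per_day[day] for day in days)
    return list(zip(days, running_totals))
```

The sketch counts each visitor on their registration day, so the running total has a fixed start and a growing end even though every visitor is only counted once. Days with no qualifying registrations are simply absent from the result rather than repeated with the previous total.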
PS: the query I used to get the number of users who tried free features before registering is below. I have to admit that I don't quite like this query, especially how it determines whether the first event's is_free_trial in a bucket is true...