Is it possible to cumulative_sum over the metric of a moving interval?

di.lu · September 7, 2021, 9:54am

Hiii ES community!

I've been stuck on a tricky requirement for quite a while. I have time series data capturing when a user used which feature under which identity, like this:

The trace_id identifies a unique visitor interacting with the web app, and is computed from data like IP and User-Agent.

With this data lying in ES, I am hoping to answer "how many people used free trial features before registering?". I've created a query that answers the question. But I couldn't figure out how to get this metric as a cumulative sum over some time period..

date_histogram + cumulative_sum doesn't seem to work because each bucket interval's min date is fixed, whereas a visitor may have trialled the application long before registering (possibly months earlier). So I guess what's really needed is a special "interval" setting that has a fixed start date (e.g., the date the application went live) and a moving end date that grows daily/weekly/monthly?
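
To make the "fixed start, moving end" idea concrete, here is a minimal Python sketch of the desired output, computed client-side rather than inside Elasticsearch. It assumes the registration day of each qualifying visitor has already been extracted; the function name `cumulative_by_day` and the sample dates are hypothetical, for illustration only.

```python
from collections import Counter
from datetime import date, timedelta


def cumulative_by_day(registration_days, start, end):
    """Return [(day, running_total)] for every day from start to end inclusive.

    Each entry answers: how many qualifying visitors registered
    between the fixed start date and this day?
    """
    per_day = Counter(registration_days)
    total = 0
    out = []
    day = start
    while day <= end:
        total += per_day[day]  # Counter returns 0 for days with no registrations
        out.append((day, total))
        day += timedelta(days=1)
    return out


# Hypothetical data: registration day of each visitor who trialled first.
days = [date(2021, 9, 1), date(2021, 9, 1), date(2021, 9, 3)]
result = cumulative_by_day(days, date(2021, 9, 1), date(2021, 9, 4))
```

Each row covers the window from the fixed start date up to that day, so the window's end moves while its start does not. Registrations dated before `start` are ignored in this sketch.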

di.lu · September 7, 2021, 10:15am

PS: the query I used to get the number of users who tried free features before registering is below. Gotta admit that I don't quite like this query, especially how it determines whether the first event in a bucket has is_free_trial set to true...

{
  "aggs": {
    "trace": {
      "terms": {
        "field": "trace.id.keyword",
        "size": 1000
      },

      "aggs": {
        
        "min_ts": {
          "min": {
            "field": "@timestamp"
          }
        },

        "free_trial": {
          "filter": {
            "term": {
              "is_free_trial": "true"
            }
          },
          "aggs": {
            "min_ts": {
              "min": {
                "field": "@timestamp"
              }
            }
          }
        },
        
        "num_registrations": {
          "filter": {
            "term": {
              "feature.keyword": "registration"
            }
          },
          "aggs": {
            "count": {
              "value_count": {
                "field": "event_type.keyword"
              }
            }
          }
        },

        "is_trial_then_register": {
          "bucket_selector": {
            "buckets_path": {
              "numRegistrationEvents": "num_registrations>count",
              "bucketMinTs": "min_ts",
              "freeTrialMinTs": "free_trial>min_ts"
            },
            "script": {
              "lang": "painless",
              "source": "params.numRegistrationEvents >= 1 && params.freeTrialMinTs == params.bucketMinTs"
            }
          }
        }

      }
    },
    
    "num_trial_then_register": {
      "sum_bucket": {
        "buckets_path": "trace>num_registrations>count"
      }
    }
  }
}
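
For reference, the per-trace condition in the bucket_selector above can be sketched in plain Python. This is only an illustration of the logic, not an Elasticsearch feature: the event tuples, integer timestamps, and the function name `trial_then_register` are assumptions made for the example.

```python
from collections import defaultdict


def trial_then_register(events):
    """Return the set of trace ids whose earliest event is a free-trial event
    and which have at least one registration event.

    events: iterable of (trace_id, timestamp, feature, is_free_trial).
    """
    by_trace = defaultdict(list)
    for trace_id, ts, feature, is_free_trial in events:
        by_trace[trace_id].append((ts, feature, is_free_trial))

    qualifying = set()
    for trace_id, evs in by_trace.items():
        bucket_min_ts = min(ts for ts, _, _ in evs)
        trial_ts = [ts for ts, _, is_trial in evs if is_trial]
        # Mirrors: params.freeTrialMinTs == params.bucketMinTs
        first_is_trial = bool(trial_ts) and min(trial_ts) == bucket_min_ts
        # Mirrors: params.numRegistrationEvents >= 1
        registered = any(feature == "registration" for _, feature, _ in evs)
        if first_is_trial and registered:
            qualifying.add(trace_id)
    return qualifying


# Hypothetical events: (trace_id, timestamp, feature, is_free_trial)
events = [
    ("a", 1, "search", True), ("a", 5, "registration", False),  # trial, then registers
    ("b", 2, "registration", False), ("b", 3, "search", True),  # registers first
    ("c", 4, "search", True),                                   # never registers
]
matched = trial_then_register(events)
```

Note that this sketch counts qualifying traces, whereas the sum_bucket in the query adds up registration event counts, so the two numbers differ whenever a trace has more than one registration event.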

