Can one Elasticsearch index be used to calculate all metrics and drive all visualizations in Kibana?

This is a general question. I find that not all metrics can be computed from a single index in Elasticsearch. Case in point: suppose we have a dataset like the following
(ts = login timestamp, a long integer)
(email = email of the user who logged in)

email_1, ts1
email_2, ts2
email_3, ts3
email_1, ts2
email_4, ts5
email_2, ts6
email_1, ts7
and so on....

A metric like the daily count of unique new users is easy using a transform (configurable from Kibana): group by email, aggregate min(ts), then build a daily date histogram on the transformed index.
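To make the transform logic concrete, here is the same computation sketched in plain Python. The `logins` list and its timestamps are made-up sample data, not part of the original dataset:

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical (email, ts) login events; ts is epoch seconds.
logins = [
    ("email_1", 1_700_000_000),  # 2023-11-14 UTC
    ("email_2", 1_700_086_400),  # 2023-11-15 UTC
    ("email_1", 1_700_172_800),  # 2023-11-16 UTC (a repeat login)
]

# Mirror of the transform: group by email, aggregate min(ts).
first_seen = {}
for email, ts in logins:
    if email not in first_seen or ts < first_seen[email]:
        first_seen[email] = ts

# Daily histogram over first-seen timestamps = new users per day.
new_users_per_day = defaultdict(int)
for email, ts in first_seen.items():
    day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
    new_users_per_day[day] += 1
```

Each user is counted only on the day of their first login, which is exactly what the min(ts) aggregation gives you.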

However, a metric like the daily count of unique RETURNING users is considerably harder. I have not been able to compute it from the index built on the dataset above. So instead I crunch the data on the backend to work out the returning users per day, produce a second dataset, and bulk-load it into a new index in Elasticsearch.
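The backend crunching for this metric can be sketched as follows: a user counts as "returning" on a given day if they logged in that day but their first-ever login was on an earlier day. The event list and timestamps are hypothetical:

```python
from collections import defaultdict
from datetime import datetime, timezone

def day_of(ts):
    """UTC calendar day for an epoch-seconds timestamp."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).date()

# Hypothetical (email, ts) login events.
logins = [
    ("email_1", 1_700_000_000),  # 2023-11-14
    ("email_2", 1_700_000_100),  # 2023-11-14
    ("email_1", 1_700_086_400),  # 2023-11-15 -> email_1 returns
]

# First login day per user (events processed in time order).
first_login_day = {}
for email, ts in sorted(logins, key=lambda e: e[1]):
    first_login_day.setdefault(email, day_of(ts))

# A login on a later day than the first one marks a returning user.
returning = defaultdict(set)  # day -> set of returning emails
for email, ts in logins:
    d = day_of(ts)
    if d > first_login_day[email]:
        returning[d].add(email)

daily_return_users = {d.isoformat(): len(s) for d, s in returning.items()}
```

Using a set per day deduplicates users who return multiple times on the same day.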

Another tricky metric is the average time between repeated logins per user. For the dataset above it works out to:

email_1, ((ts2 - ts1) + (ts7 - ts2)) / 2
email_2, (ts6 - ts2) / 1
email_3, N/A (they have not logged in again yet)
email_4, N/A (they have not logged in again yet)
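This per-user average gap is straightforward to compute outside Elasticsearch. Below is a sketch using made-up numeric values standing in for ts1..ts7 from the dataset above:

```python
from collections import defaultdict

# Hypothetical stand-ins for ts1..ts7 (epoch seconds would work the same).
ts = {1: 100, 2: 200, 3: 300, 5: 500, 6: 600, 7: 700}

# The login events from the dataset above.
logins = [
    ("email_1", ts[1]), ("email_2", ts[2]), ("email_3", ts[3]),
    ("email_1", ts[2]), ("email_4", ts[5]), ("email_2", ts[6]),
    ("email_1", ts[7]),
]

by_user = defaultdict(list)
for email, t in logins:
    by_user[email].append(t)

# Average gap between consecutive logins; None if no repeat login yet.
avg_gap = {}
for email, times in by_user.items():
    times.sort()
    gaps = [b - a for a, b in zip(times, times[1:])]
    avg_gap[email] = sum(gaps) / len(gaps) if gaps else None
```

For email_1 this reproduces ((ts2 - ts1) + (ts7 - ts2)) / 2, and email_3/email_4 come out as None, matching the N/A cases.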

For now my flow is the following:

Step 1: Use Apache Drill to wrangle the data into a CSV
Step 2: Use a Python script to convert the CSV rows to JSON
Step 3: Bulk-load the JSON documents into a new index in Elasticsearch
Step 4: Build the visualization in Kibana from this new index
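Steps 2 and 3 can be combined into one small script that emits the NDJSON body the `_bulk` API expects. The CSV content and the index name `return_users` below are hypothetical:

```python
import csv
import io
import json

# Hypothetical CSV as produced by Step 1 (the Apache Drill export).
csv_text = "email,ts\nemail_1,100\nemail_2,200\n"

def csv_to_bulk(csv_text, index_name):
    """Convert CSV rows to the NDJSON body expected by the _bulk API:
    one action line followed by one document line per row."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps({"email": row["email"], "ts": int(row["ts"])}))
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline

body = csv_to_bulk(csv_text, "return_users")
# POST this body to <your-cluster>/_bulk with
# Content-Type: application/x-ndjson (e.g. via curl or the Python client).
```

Posting NDJSON directly avoids the intermediate JSON files, though the official Python client's bulk helpers would work equally well here.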

A cron job (or some other scheduler) re-runs Steps 1 through 3 periodically.

So my question is: how do you all implement complex metrics such as these?

