Simple request here. We want to migrate more and more basic analytics team needs into ELK, but one key question I'm stumped on is: can you visualize funnels in Kibana? Specifically, drop-off rates between pages? We'll have pretty vanilla log data coming in, with some fields filled in by values from cookies and query params in the request URL (parsed out by Logstash). So we'll have all the usual fields (IP, timestamp, page, etc.) that a normal web log would have.
OK so we would have to have some unique identifier to tie events over time. For Kibana's sake we could just default to using IP address (as it is fairly generic, will fulfill most use cases). For our particular use case we'll have a GUID from a cookie identifying the user. If the user has authenticated of course, then we can use the user value from the log.
But yeah, general question: has anyone done this before? For our analytics team, this is currently a moderate use case of Mixpanel.
Could you share a sample visualization from Mixpanel you're trying to replicate?
Sure thing. It might look something like this, where it tracks some set of unique identifiers from a Home page through to Checkout page (with intermediary pages in between). It allows us to see where the highest drop-off rates are ("pain points") in the flow.
Sorry about the delay in the response.
So this is interesting - it seems you should be able to do this with a terms aggregation in a vertical bar chart. You might have to tag your documents with the exact values you want to see on the X-axis using Logstash - Kibana won't magically know what is "Home Page" vs "Results", and it does not have a grouping mechanism - that data needs to live in ES.
You won't be able to get the conversion percentages with a bar chart, so that might benefit from a new visualization type. Feel free to raise it on GitHub if this is important to you! https://github.com/elastic/kibana/issues
Ah well, it's actually more complex than this.
In my first example, this is how it would be broken down algorithmically:
- Bucket everyone (let's say they are identified by IP address) who hit the Home page into group P1.
- Now from P1, take everyone that subsequently hit the Results page (after the Home page) and put them into P2.
- Now from P2, take everyone that subsequently hit the Detail page (after the Home page -> Results) and put them into P3.
- Now from P3, take everyone that subsequently hit the Checkout page (after the Home page -> Results -> Detail) and put them into P4.
Note that Pn is conditional on P(n-1), for n > 1.
Now graph P1, P2, P3, P4 as a histogram.
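The steps above can be sketched outside of Elasticsearch as plain Python. This is only an illustration of the conditional bucketing, not something Kibana or ES does for you. The tuple layout `(visitor_id, timestamp, page)`, the page names, and the sample events are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical page names, taken from the example flow in this thread.
FUNNEL = ["Home", "Results", "Detail", "Checkout"]

# Made-up sample events: (visitor_id, timestamp, page).
SAMPLE_EVENTS = [
    ("10.0.0.1", 1, "Home"), ("10.0.0.1", 2, "Results"),
    ("10.0.0.1", 3, "Detail"), ("10.0.0.1", 4, "Checkout"),
    ("10.0.0.2", 1, "Home"), ("10.0.0.2", 2, "Results"),
    ("10.0.0.3", 1, "Results"), ("10.0.0.3", 2, "Home"),
    ("10.0.0.4", 5, "Detail"),
]

def funnel_counts(events, steps=FUNNEL):
    """Return [|P1|, |P2|, ...] where Pn requires membership in P(n-1)."""
    # Group each visitor's hits so they can be walked in time order.
    by_visitor = defaultdict(list)
    for visitor, ts, page in events:
        by_visitor[visitor].append((ts, page))

    counts = [0] * len(steps)
    for hits in by_visitor.values():
        hits.sort()  # order by timestamp (ties fall back to page name)
        depth = 0    # how many funnel steps this visitor has completed
        for _, page in hits:
            if depth < len(steps) and page == steps[depth]:
                counts[depth] += 1
                depth += 1
    return counts

print(funnel_counts(SAMPLE_EVENTS))
```

With the sample events, three visitors reach Home, two of those go on to Results, and one continues through Detail and Checkout. The visitor who hit Results before Home only counts toward P1, which is the ordering condition described in the list.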
I think we will be able to execute this kind of query with the Pipeline Aggregations -- https://www.elastic.co/guide/en/elasticsearch/reference/2.0/search-aggregations-pipeline.html -- in ES 2.0?
This is an interesting use-case which we have seen come up a couple of times before. However, it is not a good fit for aggregations, because it requires putting every entity into a separate bucket, which is prohibitively expensive for aggregations.
The other way to approach this would be to build an entity-centric index which contains a document for each person (IP address) and then store the pages the person has visited in that document. That way, when you want to aggregate across this data, you'll have one document per person rather than multiple events per person. Once you have the entity-centric index, you could use the filter aggregation to select the people who have visited the 'Home Page' and then have a sub-aggregation which is a filter aggregation for people who have visited page 2, and so on. This would give you the funnel statistics.
Plotting that in Kibana, however, would be tricky, as Kibana cannot select different bars on the bar chart from different aggregations. There may be an argument for having a dedicated 'funnel' aggregation which wraps this filtering functionality up in a single aggregation in the future.
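As a rough sketch of the nested filter aggregations described here, the request body could be generated like this. The field name `pages_visited` and the `step_N` aggregation names are hypothetical; they assume each entity document stores the pages visited as an exact-value (not analyzed) field.

```python
import json

def funnel_aggs(steps, field="pages_visited"):
    """Build a search body with one filter aggregation nested per funnel step.

    `field` is a hypothetical exact-value field on the entity document.
    """
    aggs = {}
    # Build from the innermost (last) step outwards.
    for i in reversed(range(len(steps))):
        node = {"filter": {"term": {field: steps[i]}}}
        if aggs:
            node["aggs"] = aggs
        aggs = {f"step_{i + 1}": node}
    # size 0: only the aggregation results are needed, not the hits.
    return {"size": 0, "aggs": aggs}

body = funnel_aggs(["Home", "Results", "Detail", "Checkout"])
print(json.dumps(body, indent=2))
```

Each nested level's `doc_count` in the response would then be the number of entities that passed that step and all the steps before it. Note that a plain term filter only checks that a page was visited, not the order of the visits; enforcing order would need extra information stored on the document.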
Ah this is great. Thank you for the suggestions. Great writeup on entity-centric indices.
Am I not reading the pipeline aggregation correctly? Will it do something similar to this?
Hi, would you please share an example of what the entity-centric index's document structure or mapping would look like?
For instance, here is the visit record for a user, Session-A:
Date-1 Page A -> Page B -> Page C
Date-2 Page B -> Page D -> Page A
What would this document look like?
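One possible shape, offered only as a sketch: roll the per-visit records up into a single document per entity, keeping both a deduplicated list of pages (for simple filters) and the ordered paths per date. Every field name here (`entity_id`, `first_seen`, `last_seen`, `pages_visited`, `visits`) is made up for illustration, and the input is assumed to be a non-empty list of `(date, pages)` pairs.

```python
import json

def build_entity_doc(entity_id, visits):
    """Roll up (date, [pages in order]) records into one entity document.

    All field names are hypothetical; `visits` is assumed to be non-empty.
    """
    ordered = sorted(visits)  # sort by date label
    path = [page for _, pages in ordered for page in pages]
    return {
        "entity_id": entity_id,
        "first_seen": ordered[0][0],
        "last_seen": ordered[-1][0],
        # Deduplicated pages, handy for simple term filters.
        "pages_visited": sorted(set(path)),
        # Ordered path per date, kept in case visit order matters later.
        "visits": [{"date": d, "path": list(pages)} for d, pages in ordered],
    }

doc = build_entity_doc("Session-A", [
    ("Date-1", ["Page A", "Page B", "Page C"]),
    ("Date-2", ["Page B", "Page D", "Page A"]),
])
print(json.dumps(doc, indent=2))
```

If queries need to match conditions within a single visit, `visits` could be mapped as a nested field; otherwise a plain array of objects is probably enough.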
I recently developed a visualization plugin for funnels.
It works on buckets, so if you can express your data as aggregations, you should be able to display it.
Hope it helps.