Create a visualization with a metric expressing the ratio between two other aggregations


(Carlo Arnaboldi) #1

Greetings.

Let's say that I have this very simple index, where each document is made by 3 columns:

Col.A -> can be 1,2 or 3
Col.B -> contains a number between 1 and 1000
Col.C -> like Col.B, it contains a number between 1 and 1000

Now, I' create a vertical bar visualization. First of all I split the buckets according to ColA value.
Then I create 2 metrics (one with SUM(Col.B) and the other with SUM(Col.C)), each with it's own axis.
This works perfectly.

The problem: I need to create a third metric (so that for each bucket I now have 3 vertical bars, not just 2).
This third metric should contain the ratio between the previous 2 metrics (SUM(Col.B) / SUM(Col.C)), FOR EACH BUCKET.

I'm unable to do that in Kibana v.6.1.1. Digging around I found this post which seems to be related to my problem:

but it looks like that for vertical bar visualization it still isn't possible. In the last posts people are suggesting to use external plugins in order to achieve this. To me, this is a very basic requirement, isn't this supported by kibana (out-of-the-box) for real?

Thanks!


(Thomas Neirynck) #2

hi @Carlo_Arnaboldi,

if this is time-series data, you can useTimelion to do that, which has a command-line interface to build graphs.

For more info, see here https://www.elastic.co/blog/timelion-timeline and here https://www.elastic.co/guide/en/kibana/current/timelion-inline-help.html. You'll be using the .divide() function.

The new v6.2 version of Kibana also has support for the Vega-language. This is an experimental feature, but this will allow you to use the Vega-language to express these ratios as well. To get started, see https://www.elastic.co/guide/en/kibana/current/vega-graph.html

There's an outstanding existing feature request, https://github.com/elastic/kibana/issues/4707, to do this with the built-in ES-scripting capabilities as well.


(Carlo Arnaboldi) #3

thanks for the kind reply: I've been experimenting with Vega plugin and I've come up with this visualization (I past it here so that if somebody has the same requirements maybe they can take this as an hint):

{
"$schema": "https://vega.github.io/schema/vega/v3.0.json",
"width": 300,
"height": 240,
"padding": 5,
data: {
"name":"table",
"url": {
"index": "myIndex",
"body": {
"size": 0,
"_source": {
"excludes": []
},
"aggs": {
"3": {
"terms": {
"field": "profile.keyword",
"size": 10,
"order": {
"1": "desc"
}
},
"aggs": {
"1": {
"sum": {
"field": "producedQuantityTotal"
}
},
"2": {
"sum": {
"field": "durationInMinutesFermiTotale"
}
},
"ratio":{
"bucket_script":{
"buckets_path":{
"quantity": "1",
"durationInMinutes": "2"
},
"script": "params.durationInMinutes / params.quantity * 1000"
}
}
}
}
},
"stored_fields": [
"*"
],
"script_fields": {},
"docvalue_fields": [
"data"
],
"query": {
"bool": {
"must": [
{
"match_all": {}
},
{
"range": {
"data": {
"gte": "now-2y",
"lt": "now"
}
}
}
],
"filter": [],
"should": [],
"must_not": []
}
}
}
},
"format": {
"type": "json",
"property": "aggregations.3.buckets"
}
},
"scales": [
{
"name": "xscale",
"type": "band",
"domain": {"data": "table", "field": "key"},
"padding": 0.05,
"round": true,
"range":"width"
},
{
"name": "yscale",
"domain": {"data": "table", "field": "ratio.value"},
"nice": true,
"range": "height"
},
{
"name": "yscale2",
"domain": {"data": "table", "field": "1.value"},
"nice": true,
"range": "height",
}
],
"axes": [
{ "orient": "bottom", "scale": "xscale" },
{ "orient": "left", "scale": "yscale" },
{"orient": "right", "scale": "yscale2"}
],
"marks": [
{
"type": "rect",
"from": {"data":"table"},
"encode": {
"enter": {
"x": {"scale": "xscale", "field": "key"},
"width": {"scale": "xscale", "band": 0.5},
"y": {"scale": "yscale", "field": "ratio.value"},
"y2": {"scale": "yscale", "value": 0}
},
"update": {
"fill": {"value": "steelblue"}
},
"hover": {
"fill": {"value": "red"}
}
}
},
{
"type": "rect",
"from": {"data":"table"},
"encode": {
"enter": {
"x": {"scale": "xscale", "field": "key"},
"width": {"scale": "xscale", "band": 0.2},
"y": {"scale": "yscale2", "field": "1.value"},
"y2": {"scale": "yscale2", "value": 0}
},
"update": {
"fill": {"value": "lightgreen"}
},
"hover": {
"fill": {"value": "green"}
}
}
},]
}

The only problem I'm experiencing is that when I put this visualization in a dashboard, I'm unable to apply filters of any kind. With standard visualizations, I would see on the top a "filter" bar, which would apply filters to all the visualizations in the dashboards. This same bar is missing when I include this visualization in a dashboard. Hence, my questions:

  1. What should I do in order to supply the end-user of this visualization with the capability of filtering the data? I can't ask them to look at the visualization "vega source code" and figure out by themeselves how to apply a different filter altering the DSL query accordingly.

EDIT: I Found an answer to this first question: by following the kibana vega plugin instructions, I just had to set a timefield field on my index pattern (i just recreated it and pointed to the correct timefilter), then, I added the following lines to the "url" section of my vega code (on top of the other "url" content):

    "%context%": true,
    "%timefield%": "data",

Note that my %timefield% is named "data" since that's the name of the property which contains the timestamp for each document (since _timestamp has been deprectated in 5.5 version..). By doing so, now I'm able to filter documents passed to the visualization through the timefilter on the top right of the page. I'm also able to specify other conditions by typing them in the "Search" textbox, as long as they respect the query string language. It's not so handy but that's something. Unfortunately this doesn't solve my second question..

  1. Moreover, if possible, I'd like to supply them with the same (or similar) functionality offered by the standard historgram visualization, like changing the bucket aggregation on the fly, changing the source property which determines the first metric (the second one is the ratio between 2 specific fields, calculated on each bucket, I understand that this may not be expressed as a visualization parameter). Do I really have to create a full-fledged plugin to achieve this?

(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.