How do I make a Kibana histogram that plots the distribution of the average of key A for each distinct value of key B?

This applies to any Kibana graph but the one I am interested in is a Histogram.

My data is structured like this:

[
  { "user_id" : 3, "score" : 10 },
  { "user_id" : 1, "score" : 20 },
  { "user_id" : 2, "score" : 60 },
  { "user_id" : 1, "score" : 10 },
  { "user_id" : 2, "score" : 55 }
]

How do I make a Kibana histogram that plots the distribution of the average score for each distinct user_id ?

The graph that I'm looking for looks like this:

I was able to make this graph using the "Unique Count" aggregation on user_id , but it is incorrect because it does not average the values.

This question is identical to this one except I am interested in graphing the distribution of average "price" for each "color"

Hi, welcome to the forums! Based on your sample chart, you should use the histogram aggregation on the X axis, on the score field, and the cardinality/unique count metric on the Y axis, on the user_id field. Your histogram interval determines how many X axis values you will see, like in the sample chart you have it set to 2.

To do this in Lens, you would use the function called Ranges- this function sets an interval automatically, but you can override the interval if you want. On the vertical axis you would choose Unique count

To do this in the Bar Chart, you would choose Histogram for your X axis, then Cardinality for your Y axis.

1 Like

Hi,

Thanks for the reply, I have tried this but this skips the average step altogether. I want to bucket the documents based on distinct user_id and then average each bucket's score value. I then want to show the count distribution of the scores.

Thanks,

Eric

It depends on how many unique user IDs you have. Is it less than 10k? If so, then you can probably do that in a single request. Is it more than 10k? Then you'll need to pre-compute the numbers you want. Pre-computing also works for smaller datasets.

a. One way to precompute is using Elasticsearch transforms. This will create a pivoted index which lets you transform the documents you shared into documents like { count: 100, score: 10.0 } using a Terms aggregation on score, with count using the Terms.count. This lets you create a histogram on the score field where you can show sum of count. You can't put the histogram inside the ES Transform because the data format won't work in Kibana.

b. You can build a Vega chart that does this transformation for you. You will need to query all the aggregated data, using a Terms aggregation on user_id, and then an Average of Score as your metric. Set the size of the terms aggregation to 65000, which is the max buckets you can fetch. In Vega, you can then apply a second level of aggregation to make a histogram chart. Follow the Vega docs to get started with this.

Okay thanks for explaining that.

Is there any way I can use visual builder to do something like this?

The general question I am trying to answer is, how many users are scoring well over time? Several scores for the same user can come in at the same time, which is why I want to average them.

In visual builder, can I plot a single line that shows the count of users with average scores above a certain threshold?

If you really want to calculate that kind of chart without changing your data format, you need to use Vega. Nothing else is as powerful in Kibana.

You can't do this in TSVB with your data format because there are too many steps involved in the presentation of the data. You can do this in TSVB if you first transform the data to have a per-user average.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.