Hi, welcome to the forums! Based on your sample chart, you should use the histogram aggregation on the X axis, on the score field, and the cardinality/unique count metric on the Y axis, on the user_id field. Your histogram interval determines how many X axis values you will see, like in the sample chart you have it set to 2.
To do this in Lens, you would use the function called Ranges- this function sets an interval automatically, but you can override the interval if you want. On the vertical axis you would choose Unique count
To do this in the Bar Chart, you would choose Histogram for your X axis, then Cardinality for your Y axis.
Thanks for the reply, I have tried this but this skips the average step altogether. I want to bucket the documents based on distinct user_id and then average each bucket's score value. I then want to show the count distribution of the scores.
It depends on how many unique user IDs you have. Is it less than 10k? If so, then you can probably do that in a single request. Is it more than 10k? Then you'll need to pre-compute the numbers you want. Pre-computing also works for smaller datasets.
a. One way to precompute is using Elasticsearch transforms. This will create a pivoted index which lets you transform the documents you shared into documents like { count: 100, score: 10.0 } using a Terms aggregation on score, with count using the Terms.count. This lets you create a histogram on the score field where you can show sum of count. You can't put the histogram inside the ES Transform because the data format won't work in Kibana.
b. You can build a Vega chart that does this transformation for you. You will need to query all the aggregated data, using a Terms aggregation on user_id, and then an Average of Score as your metric. Set the size of the terms aggregation to 65000, which is the max buckets you can fetch. In Vega, you can then apply a second level of aggregation to make a histogram chart. Follow the Vega docs to get started with this.
Is there any way I can use visual builder to do something like this?
The general question I am trying to answer is, how many users are scoring well over time? Several scores for the same user can come in at the same time, which is why I want to average them.
In visual builder, can I plot a single line that shows the count of users with average scores above a certain threshold?
If you really want to calculate that kind of chart without changing your data format, you need to use Vega. Nothing else is as powerful in Kibana.
You can't do this in TSVB with your data format because there are too many steps involved in the presentation of the data. You can do this in TSVB if you first transform the data to have a per-user average.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.