Make visualizations (metric and table/topn) with weighted average

Hi there,

I'm struggling trying to visualize a Metric visualization with a given weighted average over a certain timespan. Let's consider the following scenario:

index: sample_shops_data
fields of interest: day (date), shop_id (text), shortmessage (text)
sample log: 2020-11-11 | X | foo
possible values of shop_id: (X, Y, Z).
possible values of shortmessage: (foo, bar, whatever)

Let's consider the following situation concerning some collected logs, all of them with shortmessage: foo

2020-11-06: 3 affected shop_ids, 47 total events (X=14, Y=14, Z=19)
2020-11-07: 0 affected shop_ids, 0 total events (X=0, Y=0, Z=0)
2020-11-08: 2 affected shop_ids, 55 total events (X=23, Y=32, Z=0)
2020-11-09: 2 affected shop_ids, 29 total events (X=15, Y=0, Z=14)
2020-11-10: 3 affected shop_ids, 38 total events (X=11, Y=15, Z=12)
2020-11-11: 3 affected shop_ids, 30 total events (X=12, Y=6, Z=12)
2020-11-12: 3 affected shop_ids, 50 total events (X=21, Y=11, Z=18)

Average number of "foo" events by shop_id per day:
((47/3) + 0 + (55/2) + (29/2) + (38/3) + (30/3) + (50/3)) / 7 = 13.85

Obviously, other than weight the average dividing each daily total by the number of the shop_ids affected that day, it also has to consider those days where no shop_id has been affected (for example those days where no "foo" shortmessage has been ingested).

Furthermore, I'd like to have another visualization (Table or TopN for example) listing all those shop_ids whose average over the same timespan is greater than that 13.85. In this specific case:

  X --> (14+0+23+15+11+12+21)/7 = 13.71
  Y --> (14+0+32+0+15+6+11)/7 = 11.14
  Z --> (19+0+0+14+12+12+18)/7 = 10.71

none of them would be visualized since they're all below 13.85 threshold.

How can I build such visualizations? I guess using something like Bucket Script or Math in the TSVB but I cannot find the right settings. Possibly I might have to mix something like the Cardinality with the Overall Average or Moving Average and compute the weighted part with a Bucket Script, but I cannot find the right syntax nor the right Interval settings to set in the Panel Options to make it work.

Any ideas?
Thanks!

Ok with the help of some colleagues of mine I think we found a partial solution to this issue.

Let's consider the following scheme (something changed in terms of docs count but the general concept is always the same)

2020-11-06: 3 affected shop_ids, 141 total events (X=50, Y=42, Z=49)
2020-11-07: 0 affected shop_ids, 0 total events (X=0, Y=0, Z=0)
2020-11-08: 2 affected shop_ids, 55 total events (X=23, Y=32, Z=0)
2020-11-09: 2 affected shop_ids, 29 total events (X=15, Y=0, Z=14)
2020-11-10: 3 affected shop_ids, 38 total events (X=11, Y=15, Z=12)
2020-11-11: 3 affected shop_ids, 77 total events (X=37, Y=17, Z=23)
2020-11-12: 3 affected shop_ids, 50 total events (X=21, Y=11, Z=18)

((141/3) + 0 + (55/2) + (29/2) + (38/3) + (77/3) + (50/3)) / 7 = 20.57

X --> (50+0+23+15+11+37+21)/7 = 22.42
Y --> (42+0+32+0+15+17+11)/7 = 16.71
Z --> (49+0+0+14+12+23+18)/7 = 16.57

I had to add something in the ingestion phase to extrapolate only the date (YYYY-MM-dd) from the timestamp in order to use the group by in the TSVB.
This is what we managed to achieve

So what we did for the Metric in the TSVB was the following:

As you can see in the Data Settings we had to group by day to achieve what we wanted. At first we tried with the @timestamp but it's not feasible when the events differ by milliseconds (we'd need an unthinkable number in the Top setting of the Group by part).

As for the Table, we left the same settings and simply switch to TSVB Table view, grouping by the shop_id.keyword field.

Unfortunately, we didn't manage to find a way to filter the table rows leaving only those rows greater than what was computed by the Metric by at this point I don't think whether it is feasible or not.

If you find a way, please let me know.

Thanks!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.