Hello. I am mirroring this question (Format for multi-level aggregation in Vega): how do you access an array of sub-buckets in Vega/Vega-Lite via Kibana?
I have a query that aggregates over a given date range, which creates an array of buckets. In each of these buckets, I have a terms aggregation (over a field which has only 2 possible values) which generates another set of buckets, within which I compute extended_stats. Here is what my response looks like:
{
"took": 43,
"timed_out": false,
"_shards": {
"total": 145,
"successful": 145,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 206,
"max_score": 0,
"hits": []
},
"aggregations": {
"time_buckets": {
"buckets": [
{
"key": 0,
"doc_count": 6,
"study_allocation": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 0,
"key_as_string": "false",
"doc_count": 4,
"pain_stats": {
"count": 4,
"min": 3,
"max": 3,
"avg": 3,
"sum": 12,
"sum_of_squares": 36,
"variance": 0,
"std_deviation": 0,
"std_deviation_bounds": {
"upper": 3,
"lower": 3
}
}
},
{
"key": 1,
"key_as_string": "true",
"doc_count": 2,
"pain_stats": {
"count": 2,
"min": 3,
"max": 3,
"avg": 3,
"sum": 6,
"sum_of_squares": 18,
"variance": 0,
"std_deviation": 0,
"std_deviation_bounds": {
"upper": 3,
"lower": 3
}
}
}
]
}
},
{
"key": 1,
"doc_count": 3,
"study_allocation": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 0,
"key_as_string": "false",
"doc_count": 2,
"pain_stats": {
"count": 2,
"min": 2,
"max": 3,
"avg": 2.5,
"sum": 5,
"sum_of_squares": 13,
"variance": 0.25,
"std_deviation": 0.5,
"std_deviation_bounds": {
"upper": 3.5,
"lower": 1.5
}
}
},
{
"key": 1,
"key_as_string": "true",
"doc_count": 1,
"pain_stats": {
"count": 1,
"min": 4,
"max": 4,
"avg": 4,
"sum": 4,
"sum_of_squares": 16,
"variance": 0,
"std_deviation": 0,
"std_deviation_bounds": {
"upper": 4,
"lower": 4
}
}
}
]
}
},
{
"key": 2,
"doc_count": 3,
"study_allocation": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 0,
"key_as_string": "false",
"doc_count": 3,
"pain_stats": {
"count": 3,
"min": 2,
"max": 4,
"avg": 3,
"sum": 9,
"sum_of_squares": 29,
"variance": 0.6666666666666666,
"std_deviation": 0.816496580927726,
"std_deviation_bounds": {
"upper": 4.6329931618554525,
"lower": 1.367006838144548
}
}
}
]
}
},
{
"key": 3,
"doc_count": 4,
"study_allocation": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 0,
"key_as_string": "false",
"doc_count": 3,
"pain_stats": {
"count": 3,
"min": 3,
"max": 5,
"avg": 3.6666666666666665,
"sum": 11,
"sum_of_squares": 43,
"variance": 0.8888888888888881,
"std_deviation": 0.9428090415820629,
"std_deviation_bounds": {
"upper": 5.552284749830792,
"lower": 1.7810485835025407
}
}
},
{
"key": 1,
"key_as_string": "true",
"doc_count": 1,
"pain_stats": {
"count": 1,
"min": 3,
"max": 3,
"avg": 3,
"sum": 3,
"sum_of_squares": 9,
"variance": 0,
"std_deviation": 0,
"std_deviation_bounds": {
"upper": 3,
"lower": 3
}
}
}
]
}
},
...etc
In Vega-Lite, I want to split the line chart into separate lines, one for each of the buckets in study_allocation. I see no way in Vega to access array elements in an expression. For example, if I want to do a transform based on the extended_stats contained in the first bucket of study_allocation, there seems to be no way to do that. As in the linked post, the buckets come back as undefined. E.g. I want to use datum.study_allocation.buckets[0].pain_stats.avg in a transform, but I am unable to. If I just compute stats over the original set of buckets without the sub-aggregation, this works with no problem.
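To make the target shape concrete, here is a minimal Python sketch (outside Kibana, not Vega) of the flattening I am after: one row per time bucket and study_allocation bucket, so that each allocation value could become its own line. The `response` dict is a trimmed copy of the first two time buckets above, and the names `flatten_buckets`, `time`, `allocation`, and `avg` are my own placeholders, not anything Vega or Kibana provides.

```python
# Trimmed copy of the response above: first two time buckets, avg only.
response = {
    "aggregations": {
        "time_buckets": {
            "buckets": [
                {"key": 0, "study_allocation": {"buckets": [
                    {"key_as_string": "false", "pain_stats": {"avg": 3}},
                    {"key_as_string": "true", "pain_stats": {"avg": 3}},
                ]}},
                {"key": 1, "study_allocation": {"buckets": [
                    {"key_as_string": "false", "pain_stats": {"avg": 2.5}},
                    {"key_as_string": "true", "pain_stats": {"avg": 4}},
                ]}},
            ]
        }
    }
}


def flatten_buckets(resp):
    """Return one flat row per (time bucket, study_allocation bucket) pair.

    Hypothetical helper for illustration only.
    """
    rows = []
    for time_bucket in resp["aggregations"]["time_buckets"]["buckets"]:
        for sub_bucket in time_bucket["study_allocation"]["buckets"]:
            rows.append({
                "time": time_bucket["key"],
                "allocation": sub_bucket["key_as_string"],
                "avg": sub_bucket["pain_stats"]["avg"],
            })
    return rows


rows = flatten_buckets(response)
for row in rows:
    print(row)
```

With rows shaped like this, a line chart could use time for x, avg for y, and allocation to separate the lines. What I cannot figure out is how to get the equivalent result inside a Vega or Vega-Lite transform.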
Any advice? Am I structuring my query the wrong way? Can I just make two separate queries and overlay them on the same chart? I'm not sure of the best way to go about this.
Cheers,
-G