Aggregation on time series data (metricbeat and TSVB)

Hi all,

The scenario is as follows:

We are using Metricbeat to capture metrics from multiple Kubernetes clusters into a single Elasticsearch cluster. We have enriched the data by adding a label with a processor, for example:

processors:
  - add_labels:
      labels:
        kubernetes.cluster: ${cluster}

We are trying to capture the overall resource utilization of each cluster. For example, we want to know the total number of cores, memory, and data received and sent, either over time or as the most recent value.

We have tried the TSVB visualization using Time Series, Metric and Table, but it seems that we need two levels of grouping to achieve this: one by node and then one by the custom label that we are adding.
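For reference, the two levels of grouping described above can be expressed directly as an Elasticsearch aggregation outside of TSVB. The following is only a sketch of a request body built as a Python dict. The field names are assumptions (add_labels is assumed to store the label as labels.kubernetes.cluster, and the CPU field is assumed to be kubernetes.node.cpu.usage.nanocores), and two_level_request is a hypothetical helper name, so adjust everything to the actual index mapping.

```python
import json

# Assumed field names; check them against the real index mapping.
CLUSTER_FIELD = "labels.kubernetes.cluster"
NODE_FIELD = "kubernetes.node.name"
CPU_FIELD = "kubernetes.node.cpu.usage.nanocores"


def two_level_request(interval="1m", max_clusters=50, max_nodes=500):
    """Build a search body: time bucket -> cluster -> node -> average,
    then sum the per-node averages for each cluster."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "over_time": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": interval},
                "aggs": {
                    "per_cluster": {
                        "terms": {"field": CLUSTER_FIELD, "size": max_clusters},
                        "aggs": {
                            # second grouping level: one bucket per node
                            "per_node": {
                                "terms": {"field": NODE_FIELD, "size": max_nodes},
                                "aggs": {"avg_nanocores": {"avg": {"field": CPU_FIELD}}},
                            },
                            # sibling pipeline aggregation: sum of the node averages
                            "cluster_nanocores": {
                                "sum_bucket": {"buckets_path": "per_node>avg_nanocores"}
                            },
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(two_level_request(), indent=2))
```

The body could be sent with any Elasticsearch client or pasted into Dev Tools. The terms sizes need to be at least as large as the number of clusters and nodes, otherwise the sums will be too low.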

For example, if I use only kubernetes.node.name as the group-by term, it works, but this does not give me the details for a specific cluster. Here is a sample visualization:

If I aggregate by the custom cluster label instead, I am not able, for example, to sum the per-node averages in order to get the average number of cores used per cluster.

Is there a way to achieve this, or do I need multiple grouping (for which there already seems to be a request on GitHub)?

Thanks,

Zareh

To keep the thread alive: does anybody have any idea how to achieve this type of aggregation for time series data?

Hi @zvazquez

Yes, there is no two-level aggregation in TSVB today.

Here is how I / we generally solve this.

Create a dashboard with a control, and in that control use the field that contains your cluster ID.

If you add your TSVB visualizations to that dashboard and then use the cluster control to pick your cluster, the TSVB visualizations will be filtered by that cluster and thus will be per cluster.

This is the way you'd build a dashboard with several different metric visualizations and then filter it by the cluster you are interested in. Otherwise every visualization would need to do the two levels; this way you control the cluster selection at the dashboard level.

Hope that makes sense.

I have even done it where clusters are assigned, for example, an organization name or BU name; you could make those controls multi-level.

The interesting thing about that is you could actually look at overall utilization across multiple clusters if you wanted to.
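To make the idea concrete: a dashboard-level cluster selection behaves roughly like a filter applied to every visualization's query. The sketch below shows an equivalent Elasticsearch query clause built in Python. It is not what Kibana generates internally; cluster_filter is a hypothetical helper, and the field name labels.kubernetes.cluster is an assumption about where add_labels stores the label.

```python
# Assumed field name; check it against the real index mapping.
CLUSTER_FIELD = "labels.kubernetes.cluster"


def cluster_filter(selected_clusters):
    """Return a query clause restricting results to the selected clusters.

    An empty selection means "no filter", i.e. all clusters together.
    """
    selected = list(selected_clusters)
    if not selected:
        return {"match_all": {}}
    # terms query: matches documents whose field equals any selected value
    return {"bool": {"filter": [{"terms": {CLUSTER_FIELD: selected}}]}}


if __name__ == "__main__":
    print(cluster_filter(["prod-east"]))               # a single cluster
    print(cluster_filter(["prod-east", "prod-west"]))  # several clusters combined
```

Selecting more than one value is what gives the combined view across multiple clusters.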

Hope this helps.

Thanks Stephen,

Something like that is what we ended up trying, but were you able to properly calculate a single metric? This works pretty nicely for line charts. For example, if we visualize the average number of cores used by pod and put it on a stacked line chart, we can get a feeling for the total number of cores, provided of course that the number of top entries in the aggregation is large enough.

But suppose we want to display the same thing as a metric: I want to know the current or most recent total number of cores used in the cluster, which should be the sum of the averages of the nanocores used. Were you able to calculate it? I was not able to make it work; when the selected time frame is long, it does not give the right result. Maybe I am missing something in TSVB. Were you able to create this type of metric? How did you do it?

Thanks again,

Zareh

Ahh I see

In your first chart you need to add a Series Agg of type Sum after the first Average. This will sum the average of the values within each bucket.

So it would be:

Average of nanocores
Series Aggregation of Type Sum
Bucket Script Math
Group by Term node name.

That should give you the right graph.
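The arithmetic behind those steps can be checked on a tiny data set. This is only an illustrative sketch in plain Python, not TSVB's implementation: it assumes samples arrive as (epoch seconds, node name, nanocores) tuples, and the helper names and sample values are made up. It averages per node inside each time bucket, sums those averages across nodes, then takes the last bucket and converts nanocores to cores (1 core = 1e9 nanocores).

```python
from collections import defaultdict


def sum_of_node_averages(samples, bucket_seconds=60):
    """Return {bucket_start: sum over nodes of each node's average in that bucket}."""
    per_node = defaultdict(list)
    for ts, node, nanocores in samples:
        bucket = ts - ts % bucket_seconds  # start of the time bucket
        per_node[(bucket, node)].append(nanocores)

    totals = defaultdict(float)
    for (bucket, _node), values in per_node.items():
        totals[bucket] += sum(values) / len(values)  # average per node, summed per bucket
    return dict(sorted(totals.items()))


def last_value(series):
    """Value of the most recent bucket."""
    return series[max(series)]


# Made-up sample data: two nodes, two one-minute buckets.
samples = [
    (0, "node-a", 1.0e9), (30, "node-a", 3.0e9),  # node-a average in bucket 0: 2e9
    (0, "node-b", 4.0e9),                         # node-b average in bucket 0: 4e9
    (60, "node-a", 2.0e9),                        # node-a average in bucket 60: 2e9
    (60, "node-b", 2.0e9), (90, "node-b", 4.0e9), # node-b average in bucket 60: 3e9
]

series = sum_of_node_averages(samples)
cores_now = last_value(series) / 1e9  # nanocores -> cores

if __name__ == "__main__":
    print(series)     # per-bucket totals in nanocores
    print(cores_now)  # most recent total in cores
```

A plain average over all documents in a bucket would give a per-node figure rather than a cluster total, which is why the sum across the grouped series is needed.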

Then, if you switch to Metric and use the Last Value time range mode, that should work.

BUT we just found a bug that shows up if you do a Series Agg and want the metric for the entire time range.

If you are looking for the recent / last value, though, that should work.
Do that and then check the metric.
