Azure Module Metricbeat

Hello,

I hope you are doing well!

I have ELK stack 7.9 running in AKS alongside Metricbeat (which runs as a DaemonSet).
Among the modules configured for Metricbeat, I have the azure module, configured as shown below:

- module: azure
  metricsets:
    - monitor
  enabled: true
  period: 300s
  client_id: '${AZURE_CLIENT_ID}'
  client_secret: '${AZURE_CLIENT_SECRET}'
  tenant_id: '${AZURE_TENANT_ID}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID}'
  resources:
    - resource_query: "resourceType eq 'Microsoft.Compute/virtualMachines'"
      metrics:
        - name: ["*"]
          namespace: "Microsoft.Compute/virtualMachines"

It was working fine for the last few days; however, I now see the following error:

ERROR [azure monitor client] azure/client.go:114 error while listing metric values by resource ID /subscriptions/<subscription_id>resourceGroups/<resourcegroup_name>/providers/Microsoft.Compute/virtualMachines/<VM_name> and namespace Microsoft.Compute/virtualMachines: insights.MetricsClient#List: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadRequest" Message="Failed to find metric configuration for provider: Microsoft.Compute, resource Type: virtualMachines, metric: VM Cached Bandwidth Consumed Percentange, Valid metrics: Percentage CPU,Network In,Network Out,Disk Read Bytes,Disk Write Bytes,Disk Read Operations/Sec,Disk Write Operations/Sec,CPU Credits Remaining,CPU Credits Consumed,Per Disk Read Bytes/sec,Per Disk Write Bytes/sec,Per Disk Read Operations/Sec,Per Disk Write Operations/Sec,Per Disk QD,OS Per Disk Read Bytes/sec,OS Per Disk Write Bytes/sec,OS Per Disk Read Operations/Sec,OS Per Disk Write Operations/Sec,OS Per Disk QD,Data Disk Read Bytes/sec,Data Disk Write Bytes/sec,Data Disk Read Operations/Sec,Data Disk Write Operations/Sec,Data Disk Queue Depth,Data Disk Bandwidth Consumed Percentage,Data Disk IOPS Consumed Percentage,OS Disk Read Bytes/sec,OS Disk Write Bytes/sec,OS Disk Read Operations/Sec,OS Disk Write Operations/Sec,OS Disk Queue Depth,OS Disk Bandwidth Consumed Percentage,OS Disk IOPS Consumed Percentage,Inbound Flows,Outbound Flows,Inbound Flows Maximum Creation Rate,Outbound Flows Maximum Creation Rate,Premium Data Disk Cache Read Hit,Premium Data Disk Cache Read Miss,Premium OS Disk Cache Read Hit,Premium OS Disk Cache Read Miss,VM Cached Bandwidth Consumed Percentage,VM Cached IOPS Consumed Percentage,VM Uncached Bandwidth Consumed Percentage,VM Uncached IOPS Consumed Percentage,Network In Total,Network Out Total"

Do you have any idea what the source of the issue could be, and could you help me solve it?

Thanks in advance.

Kind Regards

Hi @wadhah, this is strange. From the message, it looks as if all of those metrics are not supported. The problem is that they are retrieved first, so they should be supported.
Does it work if you restart the Metricbeat client?
Also, if you check in the Azure Portal at that resource level, do you see these metrics supported in the Monitor area?

Hello @MarianaD, thanks a lot for your quick response.
Honestly, I am really lost here, because when I check in Kibana I see both documents with metrics and empty documents.
I checked the metrics that I am getting, and they look the same as in the Azure Portal.
I restarted Metricbeat as well, but nothing really changed: same logs as shown above.
This behavior is really confusing, especially since, as described in my post, it was working fine beforehand.

Hi @wadhah, it seems that the metric VM Cached Bandwidth Consumed Percentange is no longer a supported metric in the Microsoft.Compute/virtualMachines namespace, although it was previously. We validate that each metric is supported and throw an error when one is not. This is why you see both results for the rest of the metrics and events containing the error message. We will be looking into this further.

Thanks a lot @MarianaD for your help, I really appreciate it.
Okay then, in this case it may be better to specify only the needed metrics (as provided by Azure) in the Metricbeat configuration, so that we can avoid this error.
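A minimal sketch of what that could look like, assuming the same credentials and resource query as in the original configuration. The metric names are taken from the "Valid metrics" list in the error message above; which ones to keep is only an illustrative choice, and the thread does not confirm that an explicit list avoids the failing lookup.

```yaml
- module: azure
  metricsets:
    - monitor
  enabled: true
  period: 300s
  client_id: '${AZURE_CLIENT_ID}'
  client_secret: '${AZURE_CLIENT_SECRET}'
  tenant_id: '${AZURE_TENANT_ID}'
  subscription_id: '${AZURE_SUBSCRIPTION_ID}'
  resources:
    - resource_query: "resourceType eq 'Microsoft.Compute/virtualMachines'"
      metrics:
        # Explicit names instead of the "*" wildcard (illustrative selection).
        - name: ["Percentage CPU", "Network In Total", "Network Out Total", "Disk Read Bytes", "Disk Write Bytes"]
          namespace: "Microsoft.Compute/virtualMachines"
```

With an explicit list, a metric that Azure renames or withdraws would no longer be requested automatically, at the cost of having to add new metrics by hand.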

I have looked a bit more into this issue and created https://github.com/elastic/beats/issues/21218 with more information; I will follow up on that ticket. Meanwhile, I suggest using the drop_event processor to drop the empty events, for example by matching on the error message: https://www.elastic.co/guide/en/beats/metricbeat/current/drop-event.html
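As a rough sketch of that workaround, the processor below drops any event whose error text contains the start of the message from the log above. The field name `error.message` is an assumption about where Metricbeat stores the error in these events; it is worth checking one of the empty documents in Kibana before relying on it.

```yaml
processors:
  - drop_event:
      when:
        contains:
          # Assumed field; verify against an actual error event in Kibana.
          error.message: "Failed to find metric configuration"
```

This only hides the error events from the index. The ERROR line would still appear in the Metricbeat log.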


@wadhah, the issue seems to have been fixed on the Azure side. Can you let us know whether you are still able to reproduce it? I am not able to see it anymore.

@MarianaD yes, now it looks fine.
Thanks a lot for keeping an eye on this issue.