Metricbeat aws cpu total pct missing

Vinicios_Grein · November 27, 2019, 4:07pm

Hi,
I've configured the Metricbeat aws module and am monitoring about 20 machines in Kibana.
Three of them don't report the field aws.ec2.cpu.total.pct; all the others do.
Is there some setting on AWS or in a config file that forces the CPU information to be sent?

Mario_Castro · November 27, 2019, 4:40pm

Hi @Vinicios_Grein

That really looks like a network issue on AWS; maybe those instances are in a different network group. Anyway, I'm summoning @Kaiyan_Sheng, who may have more info about this.

Kaiyan_Sheng · November 27, 2019, 4:57pm

@Vinicios_Grein Could you show us your Metricbeat config for the aws module, please? I don't think we have any special config there, hmm. Did you check the AWS CloudWatch metrics portal to see whether CPU metrics exist for the same instances at similar timestamps?

Vinicios_Grein · November 27, 2019, 5:20pm

Hi @Kaiyan_Sheng ,
my config file is basically the default, with my credentials:

- module: aws
  period: 300s
  metricsets:
    - ec2
  access_key_id: '${AWS_ACCESS_KEY_ID:"my_ak"}'
  secret_access_key: '${AWS_SECRET_ACCESS_KEY:"my_ak"}'
  session_token: '${AWS_SESSION_TOKEN:""}'
  default_region: '${AWS_REGION:sa-east-1}'

Everything below these lines is commented out with "#".

I'll check AWS CloudWatch for issues too.

Kaiyan_Sheng · November 27, 2019, 7:06pm

Yeah ok, thanks! Definitely check the AWS CloudWatch portal to see whether AWS is reporting the missing CPUUtilization metrics.

Vinicios_Grein · November 27, 2019, 7:40pm

Thanks @Kaiyan_Sheng

I checked AWS. The metrics are there for all instances.

I've copied this from the Kibana logs:

The left side is the instance whose CPU metric is missing, but it has some fields that the right side, where the CPU metric exists, doesn't have. Where are these fields set?

Kaiyan_Sheng · November 27, 2019, 11:21pm

Hmm, very interesting, there might be a bug in the code then. I will try to reproduce it on my side! Thanks for verifying.

In the meantime, maybe you can give the cloudwatch metricset a try to collect EC2 metrics:

- module: aws
  period: 300s
  metricsets:
    - cloudwatch
  metrics:
    - namespace: AWS/EC2
      tags.resource_type_filter: ec2:instance
      statistic: ["Average", "Maximum"]

Vinicios_Grein · November 28, 2019, 2:16pm

Hi @Kaiyan_Sheng,
I have enabled cloudwatch in aws.yml and disabled the ec2 metricset.
In Kibana I filtered on 3 instances. The first one, which has the problem (the same instance as with ec2), doesn't appear in the log, and the other ones do.

Kaiyan_Sheng · December 2, 2019, 5:12pm

Hi @Vinicios_Grein, sorry, I just got back from Thanksgiving break. I will continue here and try to reproduce the issue.

With the cloudwatch metricset, do you mean you are seeing the same issue, with metrics missing from one specific instance?

Vinicios_Grein · December 2, 2019, 5:30pm

Hi @Kaiyan_Sheng,
that's ok.

I was trying another thing, but with no success.

When I tried the ec2 metricset, 3 instances out of 20 had no "ec2.cpu.total.pct" field in the log.
I've enabled the cloudwatch metricset as you suggested, but for these same 3 instances the field is still missing, in this case "aws.metrics.CPUUtilization".

I tried to configure a new user on AWS with these settings:

Kaiyan_Sheng · December 2, 2019, 9:16pm

Hmm, I'm sorry, I can't reproduce it on my AWS account. I have 12 EC2 instances spread across several different regions, and all of their CloudWatch metrics get collected.

Are these 3 instances in the same region in your account? Are they in the running state?

Vinicios_Grein · December 3, 2019, 4:09pm

Yes, they are "running" and the region is "sa-east-1".
I've tried reinstalling Metricbeat on my instance, but the result was the same.
Then I tried installing it on another instance. One of these 3 instances started to show the metrics correctly, but the other two did not. All the other instances continue to appear correctly.
Is there any limit on the number of instances per Metricbeat?
How can I configure specific instance IDs for ec2 or cloudwatch?

Vinicios_Grein · December 3, 2019, 5:31pm

@Kaiyan_Sheng

I tried another thing. I started cloudwatch in Metricbeat with all the problem instances plus some others that are normal. This time all the fields appeared. I think there is some kind of limit on monitoring. I've counted, and there are 36 instances in my account; I told you 20, sorry.
Can you confirm whether this limit exists and, if so, how many instances Metricbeat can support?

Kaiyan_Sheng · December 3, 2019, 6:35pm

@Vinicios_Grein Thank you so much for investigating this!! Sounds like we hit a limit, hmmm.

Maybe it's because the collection period is shorter than the time it takes to collect from all the instances. What period do you have set for Metricbeat right now? If that's the case, running several Metricbeat instances in parallel, each collecting from specific regions, should help. Or maybe just specify different regions for different sessions in aws.yml. For example:

- module: aws
  period: 300s
  metricsets:
    - ec2
  credential_profile_name: test-mb
  regions:
    - us-east-1
    - us-east-2
- module: aws
  period: 300s
  metricsets:
    - ec2
  credential_profile_name: test-mb
  regions:
    - us-west-1
    - us-west-2
- module: aws
  period: 300s
  metricsets:
    - ec2
  credential_profile_name: test-mb
  regions:
    - sa-east-1
    - ap-southeast-1

Note that this does not run Metricbeat instances in parallel. If you want to try running multiple instances, you can download three Metricbeat binaries and split the config above into three separate aws.yml files, one per Metricbeat. I think that will solve the problem unless the limit is on the AWS side. This is not an ideal solution; I'm looking into a better way to solve this. Thanks!!
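One rough way to test the theory that a collection run takes longer than the period is to compare two fields on any collected document. The sketch below is in Python and assumes that event.duration is in nanoseconds (as ECS defines it) and metricset.period is in milliseconds; the helper name and the sample numbers are made up for illustration.

```python
def fetch_exceeds_period(duration_ns: int, period_ms: int) -> bool:
    """Return True when one collection run took longer than the configured period.

    Assumes event.duration is in nanoseconds and metricset.period in milliseconds.
    """
    duration_ms = duration_ns / 1_000_000
    return duration_ms > period_ms


# Hypothetical sample documents, not taken from a real run.
fast_run = {"event": {"duration": 14_000_000_000}, "metricset": {"period": 300_000}}
slow_run = {"event": {"duration": 400_000_000_000}, "metricset": {"period": 300_000}}

for doc in (fast_run, slow_run):
    too_slow = fetch_exceeds_period(doc["event"]["duration"], doc["metricset"]["period"])
    print(doc["event"]["duration"], "ns ->", "exceeds period" if too_slow else "fits in period")
```

If the duration stays well below the period on every document, the period is probably not the cause and the limit is more likely somewhere else.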

Vinicios_Grein · December 3, 2019, 7:54pm

The period is set to 300s; I also tried 60s and 600s.
All the instances are in "sa-east-1".

I'll configure all the necessary instances through cloudwatch and see the results.

Kaiyan_Sheng · December 3, 2019, 10:26pm

Thank you so much for trying!! I just created a GitHub issue to track this problem: https://github.com/elastic/beats/issues/14926

I'm currently trying to reproduce it in my AWS test account. I have created 53 instances but still haven't seen this issue. I will create more in different regions and see if that changes things.

Kaiyan_Sheng · December 3, 2019, 10:33pm

Sorry @Vinicios_Grein, I just want to make sure I didn't miss this.

I'm seeing that one of the instances from the previous collection period shows an empty aws.ec2.cpu.total.pct value, but it reports other values like aws.ec2.cpu.credit_balance just fine.

When you see an empty aws.ec2.cpu.total.pct, do you see values for other metrics for the same instance?

Kaiyan_Sheng · December 4, 2019, 1:11am

One more question: are any of the EC2 instances with missing metrics in your environment ECS related? Not sure if it matters, but just checking. Thanks!

Vinicios_Grein · December 4, 2019, 1:40am

Thanks so much for everything so far, @Kaiyan_Sheng, you are the best!

None of them are ECS related, just EC2.

Here is a full log from one instance:

{
  "_index": "metricbeat-7.4.2-2019.12.03-000002",
  "_type": "_doc",
  "_id": "i1R-zm4BBQOQ-PUcrajf",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2019-12-04T01:20:23.957Z",
    "service": {
      "type": "aws"
    },
    "ecs": {
      "version": "1.1.0"
    },
    "host": {
      "name": "ip-",
      "hostname": "ip-",
      "architecture": "x86_64",
      "os": {
        "platform": "ubuntu",
        "version": "16.04.5 LTS (Xenial Xerus)",
        "family": "debian",
        "name": "Ubuntu",
        "kernel": "4.4.0-1098-aws",
        "codename": "xenial"
      },
      "id": "ec2e96c3921ea11e248df16a6b71f076",
      "containerized": false
    },
    "agent": {
      "type": "metricbeat",
      "ephemeral_id": "0c552575-a6c0-407b-bd09-3295eb10d7b3",
      "hostname": "ip-",
      "id": "2c7abdac-9a7d-4562-b0b8-76637a1b0dc9",
      "version": "7.4.2"
    },
    "cloud": {
      "instance": {
        "id": "i-02bd802bbd4002504"
      },
      "machine": {
        "type": "c5.xlarge"
      },
      "availability_zone": "sa-east-1a",
      "provider": "aws",
      "region": "sa-east-1"
    },
    "event": {
      "dataset": "aws.ec2",
      "module": "aws",
      "duration": 14851523222
    },
    "metricset": {
      "name": "ec2",
      "period": 600000
    },
    "aws": {
      "ec2": {
        "status": {
          "check_failed_system": 0,
          "check_failed_instance": 0
        },
        "cpu": {
          "total": {}
        },
        "instance": {
          "state": {
            "name": "running",
            "code": 16
          },
          "monitoring": {
            "state": "disabled"
          },
          "core": {
            "count": 2
          },
          "threads_per_core": 2,
          "public": {
            "ip": "",
            "dns_name": "ec2-.sa-east-1.compute.amazonaws.com"
          },
          "private": {
            "dns_name": "ip-.sa-east-1.compute.internal",
            "ip": ""
          },
          "image": {
            "id": "ami-10186f7c"
          }
        },
        "diskio": {
          "read": {},
          "write": {}
        },
        "network": {
          "in": {
            "bytes": 4803463.2,
            "packets": 18453.5,
            "bytes_per_sec": 16011.544,
            "packets_per_sec": 61.51166666666666
          },
          "out": {
            "packets": 20906.9,
            "bytes": 6373195.7,
            "bytes_per_sec": 21243.985666666667,
            "packets_per_sec": 69.68966666666667
          }
        }
      },
      "tags": {
        "fin_tipo": "PRD",
        "fin_aplicacao": "App",
        "Name": "Cloud Ubuntu",
        "monitoramento": "sim",
        "Aplicacao": "Sis"
      }
    }
  },
  "fields": {
    "@timestamp": [
      "2019-12-04T01:20:23.957Z"
    ]
  },
  "highlight": {
    "cloud.instance.id": [
      "@kibana-highlighted-field@i-02bd802bbd4002504@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1575422423957
  ]
}

And here is one that's fine:

{
  "_index": "metricbeat-7.4.2-2019.12.03-000002",
  "_type": "_doc",
  "_id": "kFWHzm4BBQOQ-PUc2STq",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2019-12-04T01:30:23.957Z",
    "cloud": {
      "region": "sa-east-1",
      "instance": {
        "id": "i-08708303e4916fc15"
      },
      "machine": {
        "type": "m5.2xlarge"
      },
      "availability_zone": "sa-east-1c",
      "provider": "aws"
    },
    "metricset": {
      "period": 600000,
      "name": "ec2"
    },
    "event": {
      "module": "aws",
      "duration": 15019251360,
      "dataset": "aws.ec2"
    },
    "aws": {
      "tags": {
        "Aplicacao": "",
        "Name": "Cloud6",
        "fin_tipo": "PRD",
        "fin_aplicacao": "App2"
      },
      "ec2": {
        "cpu": {
          "total": {
            "pct": 4.6
          }
        },
        "instance": {
          "public": {
            "ip": "",
            "dns_name": "ec2-.sa-east-1.compute.amazonaws.com"
          },
          "private": {
            "ip": "",
            "dns_name": "ip-.sa-east-1.compute.internal"
          },
          "image": {
            "id": "ami-0e4e25c13f561aca0"
          },
          "state": {
            "name": "running",
            "code": 16
          },
          "monitoring": {
            "state": "disabled"
          },
          "core": {
            "count": 4
          },
          "threads_per_core": 2
        },
        "diskio": {
          "write": {},
          "read": {}
        },
        "network": {
          "in": {
            "packets_per_sec": 22.048333333333332,
            "packets": 6614.5,
            "bytes": 1583963.4,
            "bytes_per_sec": 5279.878
          },
          "out": {
            "packets": 5710.8,
            "packets_per_sec": 19.036
          }
        },
        "status": {
          "check_failed_system": 0,
          "check_failed": 0,
          "check_failed_instance": 0
        }
      }
    },
    "service": {
      "type": "aws"
    },
    "host": {
      "name": "ip-",
      "os": {
        "kernel": "4.4.0-1098-aws",
        "codename": "xenial",
        "platform": "ubuntu",
        "version": "16.04.5 LTS (Xenial Xerus)",
        "family": "debian",
        "name": "Ubuntu"
      },
      "id": "ec2e96c3921ea11e248df16a6b71f076",
      "containerized": false,
      "hostname": "ip-",
      "architecture": "x86_64"
    },
    "agent": {
      "ephemeral_id": "0c552575-a6c0-407b-bd09-3295eb10d7b3",
      "hostname": "ip-",
      "id": "2c7abdac-9a7d-4562-b0b8-76637a1b0dc9",
      "version": "7.4.2",
      "type": "metricbeat"
    },
    "ecs": {
      "version": "1.1.0"
    }
  },
  "fields": {
    "@timestamp": [
      "2019-12-04T01:30:23.957Z"
    ]
  },
  "highlight": {
    "cloud.instance.id": [
      "@kibana-highlighted-field@i-08708303e4916fc15@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1575423023957
  ]
}
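
For anyone comparing two documents like these by eye, a small script can list the fields that came back as empty objects and the fields that only one document has. This is only a sketch in Python: the helper names are made up, and the two sample dictionaries are trimmed-down stand-ins for the "_source" of the documents above, not the full logs.

```python
def flatten(obj, prefix=""):
    """Map dotted field paths to leaf values; empty objects are kept as {}."""
    if isinstance(obj, dict) and obj:
        out = {}
        for key, value in obj.items():
            out.update(flatten(value, f"{prefix}.{key}" if prefix else key))
        return out
    return {prefix: obj}


def empty_objects(source):
    """Return the dotted paths whose value is an empty object."""
    return sorted(path for path, value in flatten(source).items() if value == {})


def only_in_first(first, second):
    """Return the dotted paths present in `first` but absent from `second`."""
    return sorted(set(flatten(first)) - set(flatten(second)))


# Trimmed-down stand-ins for the two "_source" objects.
bad = {"aws": {"ec2": {"cpu": {"total": {}}, "diskio": {"read": {}, "write": {}}}}}
good = {"aws": {"ec2": {"cpu": {"total": {"pct": 4.6}}, "diskio": {"read": {}, "write": {}}}}}

print(empty_objects(bad))
print(only_in_first(good, bad))
```

With the full documents loaded through json.load, the same two calls would show which metric groups arrived empty for the problem instance.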

Vinicios_Grein · December 4, 2019, 12:39pm

Hi @Kaiyan_Sheng
Thank you so much for everything.

Using the cloudwatch metricset with "dimensions" is working for me.
I've put all the instances I need into aws.yml, like this:

- module: aws
  period: 60s
  metricsets:
    - cloudwatch
  access_key_id: "my_keyID"
  secret_access_key: "my_AK"
  default_region: '${AWS_REGION:sa-east-1}'
  metrics:
    - namespace: AWS/EC2
      name: ["CPUUtilization"]
      tags.resource_type_filter: ec2:instance
      statistic: ["Average", "Maximum"]
      dimensions:
        - name: InstanceId
          value: i-018716881a96f13e4 
    - namespace: AWS/EC2
      name: ["CPUUtilization"]
      tags.resource_type_filter: ec2:instance
      statistic: ["Average", "Maximum"]
      dimensions:
        - name: InstanceId
          value: i-08751c22a26621e72
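
With many instances, writing one entry per instance ID by hand gets repetitive. A minimal sketch of generating the entries under "metrics:" with Python follows; the function name is made up, and it only prints text in the same layout as the config above, so the output still has to be pasted into aws.yml.

```python
def cloudwatch_metric_entries(instance_ids, indent="    "):
    """Render one AWS/EC2 CPUUtilization entry per instance ID as YAML text."""
    lines = []
    for instance_id in instance_ids:
        lines += [
            f"{indent}- namespace: AWS/EC2",
            f'{indent}  name: ["CPUUtilization"]',
            f"{indent}  tags.resource_type_filter: ec2:instance",
            f'{indent}  statistic: ["Average", "Maximum"]',
            f"{indent}  dimensions:",
            f"{indent}    - name: InstanceId",
            f"{indent}      value: {instance_id}",
        ]
    return "\n".join(lines)


# The two instance IDs from the config above; extend the list as needed.
print(cloudwatch_metric_entries(["i-018716881a96f13e4", "i-08751c22a26621e72"]))
```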
