The events I'm receiving from AWS/RDS do not match the shape I expected after reviewing the docs, so I'd like some clarification about the expected behavior. In particular, is it expected that some events are generated without any association to a DB instance, cluster, or engine name? Am I misunderstanding something about how RDS metrics are scraped through CloudWatch? I'm seeing a number of these 'fragmented' events, which makes it difficult to aggregate the metrics in a useful way, e.g. for graphing and alerting. I don't see any errors in the Metricbeat logs, even at debug level. The two events below are illustrative:
No identifiers (no way to tell which instance, cluster, or engine this event belongs to):
{
"_index": "metricbeat-2020.07.10-000017",
"_type": "_doc",
"_id": "jVnUOnMB4azz_707Q0oN",
"_version": 1,
"_score": null,
"_source": {
"@timestamp": "2020-07-10T22:24:11.004Z",
"cloud": {
"provider": "aws",
"region": "us-east-1",
...
}
},
"ecs": {
"version": "1.5.0"
},
"host": {
...
},
"agent": {
...
},
"event": {
"dataset": "aws.rds",
"module": "aws",
"duration": 4126072412
},
"metricset": {
"name": "rds",
"period": 60000
},
"service": {
"type": "aws"
},
"aws": {
"rds": {
"throughput": {
"insert": 0.49994167347142837,
"network_receive": 97687.58228755229,
"ddl": 0,
"select": 88.28969953505424,
"update": 0,
"network": 156748.4459110378,
"network_transmit": 59060.86362348549,
"dml": 0.49994167347142837,
"commit": 0.49994167347142837,
"delete": 0
},
"login_failures": 0,
"db_instance.class": "db.r5.xlarge",
"deadlocks": 0,
"freeable_memory.bytes": 5594017792,
"aurora_bin_log_replica_lag": 0,
"disk_usage": {
"bin_log.bytes": 0
},
"cache_hit_ratio.result_set": 30.53247734138973,
"free_local_storage.bytes": 79721652224,
"engine_uptime.sec": 9770291,
"database_connections": 98,
"cpu": {
"total": {
"pct": 0.04
}
},
"latency": {
"commit": 2.5976,
"ddl": 0,
"dml": 0.14663333333333334,
"update": 0,
"delete": 0,
"insert": 0.14663333333333334,
"select": 0.14652510381275952
},
"cache_hit_ratio.buffer": 100,
"transactions": {
"active": 0,
"blocked": 0
},
"queries": 107.86992303335221
}
}
},
"fields": {
"@timestamp": [
"2020-07-10T22:24:11.004Z"
]
},
...
}
Partial identifiers (a DBInstanceIdentifier, but no cluster identifier or engine name, even though the instance is part of an aurora-mysql cluster):
{
"_index": "metricbeat-2020.07.10-000017",
"_type": "_doc",
"_id": "-lnUOnMB4azz_707Q0op",
"_version": 1,
"_score": null,
"_source": {
"@timestamp": "2020-07-10T22:24:11.004Z",
"cloud": {
"availability_zone": "us-east-1b",
"provider": "aws",
"region": "us-east-1",
...
}
},
"metricset": {
"name": "rds",
"period": 60000
},
"event": {
"dataset": "aws.rds",
"module": "aws",
"duration": 4127377448
},
"service": {
"type": "aws"
},
"ecs": {
"version": "1.5.0"
},
"host": {
...
},
"agent": {
...
"type": "metricbeat",
"version": "7.8.0",
...
},
"aws": {
"rds": {
"throughput": {
"ddl": 0,
"dml": 0.5,
"select": 38.1,
"update": 0,
"commit": 0.5,
"delete": 0,
"network_receive": 31193.1596579829,
"network": 66075.15375768789,
"insert": 0.5,
"network_transmit": 34881.994099704985
},
"latency": {
"update": 0,
"delete": 0,
"insert": 0.1658,
"select": 0.168831583552056,
"commit": 2.9006333333333334,
"ddl": 0,
"dml": 0.1658
},
"aurora_bin_log_replica_lag": 0,
"db_instance": {
"arn": ...
"class": "db.r5.large",
"identifier": "v1bd02mya01",
"status": "available"
},
"free_local_storage.bytes": 29452050432,
"database_connections": 48,
"login_failures": 0,
"queries": 51.214105961368595,
"deadlocks": 0,
"db_instance.identifier": "v1bd02mya01",
"freeable_memory.bytes": 4686135296,
"disk_usage": {
"bin_log.bytes": 0
},
"cpu": {
"total": {
"pct": 0.07
}
},
"cache_hit_ratio.buffer": 100,
"cache_hit_ratio.result_set": 12.045554095488392,
"engine_uptime.sec": 13412709,
"transactions": {
"active": 0,
"blocked": 0
}
}
}
},
"fields": {
"@timestamp": [
"2020-07-10T22:24:11.004Z"
]
},
...
}
How could I aggregate, say, queries per second, open connections, or other DB-level metrics, grouped by instance and filtered by a particular engine or cluster, with this data? There doesn't appear to be any association between the measurements reported at these different dimensions. I'm happy to provide any additional information you might need. Any input would be much appreciated, thank you!
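For reference, here is a minimal Python sketch of the kind of grouping I'm trying to do, run against trimmed-down copies of the two events above. The function names (`instance_identifier`, `mean_queries_by_instance`) are my own and purely illustrative; they are not part of Metricbeat or Elasticsearch. The sketch assumes the identifier may appear either as a dotted key or as a nested object, since both show up in the second event.

```python
from collections import defaultdict


def instance_identifier(source):
    """Return the DB instance identifier from an event's _source, or None."""
    rds = source.get("aws", {}).get("rds", {})
    # The second event carries both a dotted key and a nested object, so check both.
    dotted = rds.get("db_instance.identifier")
    if dotted is not None:
        return dotted
    nested = rds.get("db_instance")
    if isinstance(nested, dict):
        return nested.get("identifier")
    return None


def mean_queries_by_instance(sources):
    """Average aws.rds.queries per instance identifier (None = no identifier)."""
    grouped = defaultdict(list)
    for source in sources:
        queries = source.get("aws", {}).get("rds", {}).get("queries")
        if queries is not None:
            grouped[instance_identifier(source)].append(queries)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


# Trimmed-down versions of the two events shown above.
events = [
    {"aws": {"rds": {"queries": 107.86992303335221,
                     "db_instance.class": "db.r5.xlarge"}}},
    {"aws": {"rds": {"queries": 51.214105961368595,
                     "db_instance": {"identifier": "v1bd02mya01"},
                     "db_instance.identifier": "v1bd02mya01"}}},
]

result = mean_queries_by_instance(events)
for identifier, mean_queries in result.items():
    print(identifier, mean_queries)
```

The first event ends up under the `None` key because it carries no identifier at all, which is exactly the bucket I can't attribute to any instance, cluster, or engine.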