Metricbeat 8.7.0 mysql 8, performance module, 1400+ event rate/sec, 16gb an hour

Topic title really says it all.
My database has well over 100 databases, most are wordpress - so that's at least 15 tables per.
And the module runs the following query:

SELECT object_schema, object_name, index_name, count_fetch
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE count_fetch > 0

which yields 17,450+ results. Which metricbeat appears to be dispatching as summed valuesets per object_schema (but guessing). Which is a massive amount of index documents per interval (10second default).
I am presuming that is the cause of the incredible data frequency.

I'm just curious if that is expected behaviour? For me it has rendered both the beat, and the kibana dashboards, entirely unusable. 400 gigabytes of data per day is absolutely insane for a single beat source. And it's killing the CPU on both machines (sender/receiver).

Any recommendations? Perhaps I can adjust the module query to pre-group and sum the results to reduce to an "average of performance" across all schemas, for instance?
Or is this unexpected behaviour and something is wrong?

Simple solution for me, so far, is to just reset the statistics daily over a cron job.

Truncating the counters daily means metricbeat is feeding valuable-data only at a daily level now, which isn't perfect (hard drop at midnight, lacking value over time). Tables/Indexes that perform poorly beyond a daily span of cumulative measurement are now lost in visibility.
But it tames the beast slightly. Down to 1-10 requests a second over a 5 min span analysis.

(i prefer this approach - event_scheduler , rather than a cron mysql cmd and exposing password to crontab or .env)

/* SHOW VARIABLES LIKE "event_scheduler"; */
/* SET GLOBAL event_scheduler = ON; */

CREATE EVENT reset_performance_index_io_waits_daily
ON SCHEDULE EVERY 1 DAY
STARTS '2023-04-26 00:00:00'
DO
TRUNCATE TABLE `performance_schema`.table_io_waits_summary_by_index_usage

It still grows at a rapid pace, due to shear size of tables and indexes.
So I think perhaps an alternative would be to adjust the manifest.yml for modules/mysql/performance to read like so:

SELECT object_schema, object_name, index_name, count_fetch
          FROM performance_schema.table_io_waits_summary_by_index_usage
          WHERE count_fetch > 0 
          /*new:: */ AND index_name IS NULL

That way we are feeding io waits only for tables that aren't performing well due to lack of index (and/or are insert statements).
There's a value cost either way, but at least it's stable for me now.

Open to thoughts by anyone on this, anyone run into this before or no?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.