Heartbeat high network utilization

I'm trying to figure out why Heartbeat seems to be using so much network bandwidth.

I have a single HTTP monitor for an Elasticsearch endpoint, scheduled every 10 s. The endpoint returns a response of roughly 600 bytes with a 200 OK status.
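As a back-of-envelope estimate, a ~600-byte response every 10 s works out to about 60 B/s, i.e. roughly 480 bps of payload. Even allowing for TLS, HTTP headers, and the resulting event shipped to the output, I'd expect at most a few kbps, so the numbers I'm seeing (below) look several orders of magnitude too high.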

Here is my configuration:

    compression_level: 9

    heartbeat.monitors:
    - type: http
      id: sample
      name: sample
      hosts: ["https://************.us-east-1.aws.found.io:9243"]
      check.response.status: [200]
      schedule: '@every 10s'
      username: *****************
      password: **********************
    heartbeat.scheduler:
      limit: 10

    reload.enabled: false

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    #processors:
    #  - add_cloud_metadata: ~
    #
    #  - add_observer_metadata:
    #      cache.ttl: 5m
    #      geo:
    #        name: us-east-1
    #        location: 39.0437, -77.4875
    #        continent_name: North America
    #        country_iso_code: US
    #        region_name: N. Virginia
    #        region_iso_code: VA
    #        city_name: Ashburn

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      ssl.verification_mode: none
      protocol: https
      compression_level: 9

I'm running docker.elastic.co/beats/heartbeat:7.9.1 in a Kubernetes pod with no additional processors enabled, and I've set compression_level to 9 because I can afford the CPU cost.

CPU utilization is low (around 1 millicore) and memory is around 20 MiB. However, the pod shows relatively sustained network I/O of around 1.3 Mbps transmit and 1.3 Mbps receive, along with roughly 1,300 packets per second.
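Dividing those figures: 1.3 Mbps is about 162,500 B/s, and 162,500 B/s ÷ 1,300 packets/s ≈ 125 bytes per packet, so this looks like a steady stream of small packets rather than occasional bulk transfers.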

I've verified the metrics in different monitoring solutions, including Elastic Observability and Prometheus/Grafana, so the numbers don't seem to be a "trick of the light".

The network utilization seems high to me given the check frequency and the size of the request/response. Furthermore, I would expect the network I/O pattern for a periodic check to show spikes of utilization rather than a sustained flow.
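One cross-check I may try: libbeat has a local stats endpoint (disabled by default) that reports the Beat's own event and output counters, which I could compare against the pod-level network metrics. A minimal sketch, assuming the stock 7.9.1 image:

    # Hypothetical addition to heartbeat.yml: expose the local stats endpoint
    # so Heartbeat's own output counters can be compared with the pod metrics.
    http.enabled: true
    http.host: localhost   # reachable only from inside the pod
    http.port: 5066        # default port for the Beats stats endpoint

Hitting localhost:5066/stats from inside the pod should then return JSON with the libbeat output counters.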

Originally I had about 8 monitors in the config and observed the same utilization. While troubleshooting I reduced it to a single monitor, but that didn't seem to have any impact, which suggests the traffic doesn't scale with the number of checks.

Has anyone else experienced this? While Heartbeat appears to be working (I'm receiving the signal in Elasticsearch), perhaps there's a misconfiguration in my heartbeat.yml?
