Configure Metricbeat for RabbitMQ cluster to avoid duplication per node

JorenThijs · March 5, 2024, 11:57am

Hi, we are using Metricbeat v7.17.18 to gather information about our Kubernetes cluster and send it to our Kibana cloud instance.

In Kubernetes we also run a RabbitMQ service bus cluster: 3 RabbitMQ nodes that together form our one service bus. From RabbitMQ we collect metrics about the individual nodes and the queue sizes.

The queue sizes are the most important, because we want to visualize how many failed messages we have in our _error queues. In the past our setup was very simple and we had only one RabbitMQ node in the cluster. Now we have 3, so queue-size metrics are sent to Kibana every 60 seconds for each node in the RabbitMQ cluster. This caused our RabbitMQ data to triple in size. Below is an example of the duplication in Kibana. Note how the same queue size is logged at a 60s interval for each of the 3 RabbitMQ servers (nodes).

Since the queues we want to monitor are replicated across the 3 nodes, we don't want to log them individually 3 times. Is there a way to report the queue size only once for the whole cluster instead of once per RabbitMQ node? We currently configure Metricbeat with autodiscover and put hints on the node templates of the RabbitMQ cluster. Is there a better way?

This is how we currently configure Metricbeat and RabbitMQ:

Metricbeat DaemonSet for the cluster:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: cluster-monitoring-mb
spec:
  type: metricbeat
  version: 7.17.18
  config:
    metricbeat:
      autodiscover:
        providers:
          - type: kubernetes # leader (cluster) config
            scope: cluster
            unique: true
            hints:
              enabled: true
            templates:
              - config:
                  - module: kubernetes
                    metricsets:
                      - event
          - type: kubernetes # Non-leader (Node) config for hints-based (e.g. rabbit mq) monitoring (see https://www.elastic.co/guide/en/beats/metricbeat/7.17/configuration-autodiscover-hints.html)
            scope: node
            node: ${NODE_NAME}
            hints:
              enabled: true
          - type: kubernetes # Non-leader (Node) config for node-based monitoring
            scope: node
            node: ${NODE_NAME}
            templates:
              - config:
                  - module: kubernetes
                    host: ${NODE_NAME}
                    hosts:
                      - https://${NODE_NAME}:10250
                    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                    ssl:
                      verification_mode: none
                    period: 120s
                    metricsets:
                      - node
                      - system
                      - pod
                      - container
                      - volume
      modules:
        - module: system
          period: 120s
          metricsets:
            - cpu
            - load
            - memory
            - network
            - process
            - process_summary
          process:
            include_top_n:
              by_cpu: 5
              by_memory: 5
          processes:
            - .*
        - module: system
          period: 1m
          metricsets:
            - filesystem
            - fsstat
          processors:
            - drop_event:
                when:
                  regexp:
                    system:
                      filesystem:
                        mount_point: ^/(sys|cgroup|proc|dev|etc|host|lib)($|/)
    output.elasticsearch:
      hosts: ["${ELASTICSEARCH_HOST}"]
      api_key: ${ELASTICSEARCH_API_KEY_DECODED}
    setup.ilm.enabled: true
    setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-${environment_underscores_lowercase}$"
    setup.ilm.pattern: "{now/M{yyyy.MM}}-000001"
    setup.ilm.policy_name: "metricbeat-${environment_underscores_lowercase}$"
    setup.ilm.overwrite: false

  daemonSet:
    podTemplate:
      metadata:
        labels:
          app: cluster-monitoring-mb
      spec:
        priorityClassName: system-node-critical
        serviceAccountName: cluster-monitoring-mb
        automountServiceAccountToken: true # some older Beat versions depend on this setting being present in the k8s context
        containers:
          - args:
              - -e
              - -c
              - /etc/beat.yml
              - -system.hostfs=/hostfs
            name: metricbeat
            volumeMounts:
              - mountPath: /hostfs/sys/fs/cgroup
                name: cgroup
              - mountPath: /var/run/docker.sock
                name: dockersock
              - mountPath: /hostfs/proc
                name: proc
            env:
              - name: ELASTICSEARCH_HOST
                value: ${ELASTIC_MONITORING_URI}$
              - name: ELASTICSEARCH_API_KEY_DECODED
                value: ${ELASTIC_MONITORING_API_KEY_DECODED}$
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true # Allows to provide richer host metadata
        securityContext:
          runAsUser: 0
        terminationGracePeriodSeconds: 30
        volumes:
          - hostPath:
              path: /sys/fs/cgroup
            name: cgroup
          - hostPath:
              path: /var/run/docker.sock
            name: dockersock
          - hostPath:
              path: /proc
            name: proc
---
# permissions needed for metricbeat
# source: https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-kubernetes.html
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-monitoring-mb
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - namespaces
      - events
      - pods
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
    resources:
      - replicasets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - statefulsets
      - deployments
      - replicasets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-monitoring-mb
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-monitoring-mb
subjects:
  - kind: ServiceAccount
    name: cluster-monitoring-mb
    namespace: ${K8S_NAMESPACE}$
roleRef:
  kind: ClusterRole
  name: cluster-monitoring-mb
  apiGroup: rbac.authorization.k8s.io

RabbitMQ cluster:

# The secret here needs to be defined before the RabbitMqCluster
# otherwise the rabbit mq cluster operator will create a secret with a random username/password instead of our values
apiVersion: v1
kind: Secret
metadata:
  name: rabbit-mq-default-user
data:
  default_user.conf: ${default_user_conf_b64}$
  username: ${rabbitmq_username_b64}$
  password: ${rabbitmq_password_b64}$
  provider: cmFiYml0bXE= # base64 encoded 'rabbitmq' is always the same
  type: cmFiYml0bXE= # base64 encoded 'rabbitmq' is always the same
---
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbit-mq
spec:
  image: masstransit/rabbitmq:3.9
  replicas: 3 # Must be an odd number, see https://www.rabbitmq.com/clustering.html#node-count
  override:
    statefulSet:
      spec:
        template:
          metadata:
            annotations:
              config.linkerd.io/default-inbound-policy: all-unauthenticated
              co.elastic.metrics/module: rabbitmq
              co.elastic.metrics/hosts: "${data.host}:15672/rmq-mgmt"
              co.elastic.metrics/metricsets: "node, queue"
              co.elastic.metrics/period: 60s
              co.elastic.metrics/username: ${kubernetes.${K8S_NAMESPACE}$.rabbit-mq-default-user.username}
              co.elastic.metrics/password: ${kubernetes.${K8S_NAMESPACE}$.rabbit-mq-default-user.password}
  rabbitmq:
    additionalConfig: |
      management.path_prefix = /rmq-mgmt
    additionalPlugins:
      - rabbitmq_shovel
      - rabbitmq_shovel_management
      - rabbitmq_delayed_message_exchange
---
apiVersion: k8s.nginx.org/v1
kind: VirtualServerRoute
metadata:
  name: rabbit-mq
spec:
  host: ${HOSTNAME}$
  upstreams:
    - name: rabbit-mq
      service: rabbit-mq
      port: 15672
  subroutes:
    - path: /rmq-mgmt
      action:
        pass: rabbit-mq

strawgate · March 23, 2024, 1:32pm

I'm not super familiar with autodiscover, but you may be able to define a processor, either in the hints or in your Metricbeat config, that keeps queue metrics when they come from, say, node 1 but drops them when they come from nodes 2 and 3.
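
A minimal sketch of that idea, as a global processor in the Metricbeat config rather than in the hints. It assumes the events produced through the hints carry `kubernetes.pod.name` from autodiscover, and that the RabbitMQ pods are named `rabbit-mq-server-0`, `rabbit-mq-server-1` and `rabbit-mq-server-2` (the pod name below is an assumption, not taken from the manifests above). Check both against a real event in Kibana before relying on it.

```yaml
# Fragment for the Beat resource, placed under spec.config next to
# "metricbeat:" and "output.elasticsearch:".
processors:
  - drop_event:
      when:
        and:
          # Only touch RabbitMQ queue events; node metrics stay per node.
          - equals:
              event.module: rabbitmq
          - equals:
              metricset.name: queue
          # Keep queue events scraped from one pod, drop the other two.
          # "rabbit-mq-server-0" is an assumed pod name; adjust as needed.
          - not:
              equals:
                kubernetes.pod.name: rabbit-mq-server-0
```

The trade-off is that queue metrics now depend on a single pod: while that pod is down or restarting, no queue sizes are reported even though the other two nodes are still up. Also note that if `kubernetes.pod.name` is missing from an event, the `not` condition matches and the event is dropped.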

system · April 20, 2024, 3:32pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
