Hi, we are using Metricbeat v7.17.18 to gather information about our Kubernetes cluster and send it to our Kibana cloud instance.
In Kubernetes we also run a RabbitMQ service bus cluster: 3 RabbitMQ nodes that together form one service bus. From RabbitMQ we try to collect metrics about the individual nodes and the queue sizes.
The queue sizes are the most important, because we want to visualize how many failed messages we have in the _error queues. In the past our setup was very simple and we only had one RabbitMQ node in the cluster. Now we have 3. Because of that, metrics about queue sizes are sent to Kibana every 60 seconds for each node in the RabbitMQ cluster, which caused our RabbitMQ data to triple in size. Below is an example of the duplication in Kibana. Note how the same queue size is logged at a 60s interval for each of the 3 RabbitMQ servers (nodes).
Since the queues that we want to monitor are replicated across the 3 nodes, we don't want to log them individually 3 times. Is there a way to report the queue size only once for the whole cluster instead of once per RabbitMQ node? We currently configure Metricbeat with autodiscover and put hints on the pod template of the RabbitMQ cluster. Is there a better way?
This is how we currently configure Metricbeat and RabbitMQ:
Metricbeat DaemonSet for the cluster:
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: cluster-monitoring-mb
spec:
  type: metricbeat
  version: 7.17.18
  config:
    metricbeat:
      autodiscover:
        providers:
          - type: kubernetes # leader (cluster) config
            scope: cluster
            unique: true
            hints:
              enabled: true
            templates:
              - config:
                  - module: kubernetes
                    metricsets:
                      - event
          - type: kubernetes # Non-leader (Node) config for hints-based (e.g. rabbit mq) monitoring (see https://www.elastic.co/guide/en/beats/metricbeat/7.17/configuration-autodiscover-hints.html)
            scope: node
            node: ${NODE_NAME}
            hints:
              enabled: true
          - type: kubernetes # Non-leader (Node) config for node-based monitoring
            scope: node
            node: ${NODE_NAME}
            templates:
              - config:
                  - module: kubernetes
                    host: ${NODE_NAME}
                    hosts:
                      - https://${NODE_NAME}:10250
                    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                    ssl:
                      verification_mode: none
                    period: 120s
                    metricsets:
                      - node
                      - system
                      - pod
                      - container
                      - volume
      modules:
        - module: system
          period: 120s
          metricsets:
            - cpu
            - load
            - memory
            - network
            - process
            - process_summary
          process:
            include_top_n:
              by_cpu: 5
              by_memory: 5
          processes:
            - .*
        - module: system
          period: 1m
          metricsets:
            - filesystem
            - fsstat
          processors:
            - drop_event:
                when:
                  regexp:
                    system:
                      filesystem:
                        mount_point: ^/(sys|cgroup|proc|dev|etc|host|lib)($|/)
    output.elasticsearch:
      hosts: ["${ELASTICSEARCH_HOST}"]
      api_key: ${ELASTICSEARCH_API_KEY_DECODED}
    setup.ilm.enabled: true
    setup.ilm.rollover_alias: "metricbeat-%{[agent.version]}-${environment_underscores_lowercase}$"
    setup.ilm.pattern: "{now/M{yyyy.MM}}-000001"
    setup.ilm.policy_name: "metricbeat-${environment_underscores_lowercase}$"
    setup.ilm.overwrite: false
  daemonSet:
    podTemplate:
      metadata:
        labels:
          app: cluster-monitoring-mb
      spec:
        priorityClassName: system-node-critical
        serviceAccountName: cluster-monitoring-mb
        automountServiceAccountToken: true # some older Beat versions are depending on this settings presence in k8s context
        containers:
          - args:
              - -e
              - -c
              - /etc/beat.yml
              - -system.hostfs=/hostfs
            name: metricbeat
            volumeMounts:
              - mountPath: /hostfs/sys/fs/cgroup
                name: cgroup
              - mountPath: /var/run/docker.sock
                name: dockersock
              - mountPath: /hostfs/proc
                name: proc
            env:
              - name: ELASTICSEARCH_HOST
                value: ${ELASTIC_MONITORING_URI}$
              - name: ELASTICSEARCH_API_KEY_DECODED
                value: ${ELASTIC_MONITORING_API_KEY_DECODED}$
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true # Allows to provide richer host metadata
        securityContext:
          runAsUser: 0
        terminationGracePeriodSeconds: 30
        volumes:
          - hostPath:
              path: /sys/fs/cgroup
            name: cgroup
          - hostPath:
              path: /var/run/docker.sock
            name: dockersock
          - hostPath:
              path: /proc
            name: proc
---
# permissions needed for metricbeat
# source: https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-kubernetes.html
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-monitoring-mb
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - namespaces
      - events
      - pods
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
    resources:
      - replicasets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - apps
    resources:
      - statefulsets
      - deployments
      - replicasets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/stats
    verbs:
      - get
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-monitoring-mb
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-monitoring-mb
subjects:
  - kind: ServiceAccount
    name: cluster-monitoring-mb
    namespace: ${K8S_NAMESPACE}$
roleRef:
  kind: ClusterRole
  name: cluster-monitoring-mb
  apiGroup: rbac.authorization.k8s.io
RabbitMQ cluster:
# The secret here needs to be defined before the RabbitMqCluster
# otherwise the rabbit mq cluster operator will create a secret with a random username/password instead of our values
apiVersion: v1
kind: Secret
metadata:
  name: rabbit-mq-default-user
data:
  default_user.conf: ${default_user_conf_b64}$
  username: ${rabbitmq_username_b64}$
  password: ${rabbitmq_password_b64}$
  provider: cmFiYml0bXE= # base64 encoded 'rabbitmq' is always the same
  type: cmFiYml0bXE= # base64 encoded 'rabbitmq' is always the same
---
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbit-mq
spec:
  image: masstransit/rabbitmq:3.9
  replicas: 3 # Must be an odd number, see https://www.rabbitmq.com/clustering.html#node-count
  override:
    statefulSet:
      spec:
        template:
          metadata:
            annotations:
              config.linkerd.io/default-inbound-policy: all-unauthenticated
              co.elastic.metrics/module: rabbitmq
              co.elastic.metrics/hosts: "${data.host}:15672/rmq-mgmt"
              co.elastic.metrics/metricsets: "node, queue"
              co.elastic.metrics/period: 60s
              co.elastic.metrics/username: ${kubernetes.${K8S_NAMESPACE}$.rabbit-mq-default-user.username}
              co.elastic.metrics/password: ${kubernetes.${K8S_NAMESPACE}$.rabbit-mq-default-user.password}
  rabbitmq:
    additionalConfig: |
      management.path_prefix = /rmq-mgmt
    additionalPlugins:
      - rabbitmq_shovel
      - rabbitmq_shovel_management
      - rabbitmq_delayed_message_exchange
---
apiVersion: k8s.nginx.org/v1
kind: VirtualServerRoute
metadata:
  name: rabbit-mq
spec:
  host: ${HOSTNAME}$
  upstreams:
    - name: rabbit-mq
      service: rabbit-mq
      port: 15672
  subroutes:
    - path: /rmq-mgmt
      action:
        pass: rabbit-mq
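
One direction that might address the duplication, shown only as an untested sketch: keep the per-pod hints for the node metricset, and move the queue metricset into the leader (scope: cluster, unique: true) provider so that only one Metricbeat pod scrapes the queues. This assumes the management API returns the same cluster-wide queue list from any node, that the rabbit-mq Service is reachable from the Metricbeat pods under the DNS name used below, and that the ${kubernetes...} secret references resolve inside a leader template the same way they do in the hint annotations. None of this has been verified against our cluster.

```yaml
# Sketch under the assumptions stated above, not a verified configuration.
# 1) Leader provider: add a rabbitmq module that collects only the queue metricset.
metricbeat:
  autodiscover:
    providers:
      - type: kubernetes # leader (cluster) config
        scope: cluster
        unique: true
        templates:
          - config:
              - module: kubernetes
                metricsets:
                  - event
              # Runs only on the Metricbeat pod that currently holds the leader lease.
              - module: rabbitmq
                metricsets:
                  - queue
                period: 60s
                hosts:
                  - "rabbit-mq.${K8S_NAMESPACE}$.svc:15672" # assumed Service DNS name
                management_path_prefix: /rmq-mgmt
                username: ${kubernetes.${K8S_NAMESPACE}$.rabbit-mq-default-user.username}
                password: ${kubernetes.${K8S_NAMESPACE}$.rabbit-mq-default-user.password}
---
# 2) RabbitmqCluster pod annotations: keep per-node metrics, drop the queue metricset.
annotations:
  co.elastic.metrics/module: rabbitmq
  co.elastic.metrics/hosts: "${data.host}:15672/rmq-mgmt"
  co.elastic.metrics/metricsets: "node"
  co.elastic.metrics/period: 60s
```

With this split, node metrics would still arrive once per RabbitMQ pod, while queue sizes would be reported once per period for the whole cluster.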