How do I retrieve statistics from two Ceph clusters using Metricbeat autodiscover?

I am retrieving statistics from a Ceph cluster running in a Kubernetes deployment using this values file:

metricbeat: 
  extraEnvs:
    - name: CEPH_API_USERNAME
      value: monitoring-ceph
    - name: CEPH_API_PASSWORD
      valueFrom:
        secretKeyRef:
          name: ceph-api-user
          key: monitoring-ceph
  config:
     metricbeat.autodiscover:
       providers:
         - type: kubernetes
           host: ${NODE_NAME}
           templates:
             - condition.equals:
                 kubernetes.labels.rook_cluster: "rook-ceph"
               config:
                 - module: ceph
                   metricsets:
                   - mgr_cluster_disk
                   - mgr_osd_perf
                   - mgr_pool_disk
                   - mgr_osd_pool_stats
                   - mgr_osd_tree
                   period: 10s
                   hosts: ["https://${data.host}:8003"]
                   username: '${CEPH_API_USERNAME}'
                   password: '${CEPH_API_PASSWORD}'
                   ssl.verification_mode: "none"
                 - module: prometheus
                   period: 10s
                   hosts: ["${data.host}:9283"]
                   metrics_path: /metrics
                   metrics_filters:
                     include: ["ceph_osd_stat_byte*"]

This works as expected. Now I want to add a second Ceph cluster, so I apply a second values file:

metricbeat: 
  extraEnvs:
    - name: CEPH_API_USERNAME
      value: monitoring-ceph
    - name: CEPH_API_PASSWORD
      valueFrom:
        secretKeyRef:
          name: ceph-rdg-api-user
          key: monitoring-ceph
  config:
     metricbeat.autodiscover:
       providers:
         - type: kubernetes
           host: ${NODE_NAME}
           templates:
             - condition.equals:
                 kubernetes.labels.rook_cluster: "rdg-rook-ceph"
               config:
                 - module: ceph
                   metricsets:
                   - mgr_cluster_disk
                   - mgr_osd_perf
                   - mgr_pool_disk
                   - mgr_osd_pool_stats
                   - mgr_osd_tree
                   period: 10s
                   hosts: ["https://${data.host}:8003"]
                   username: '${CEPH_API_USERNAME}'
                   password: '${CEPH_API_PASSWORD}'
                   ssl.verification_mode: "none"
                 - module: prometheus
                   period: 10s
                   hosts: ["${data.host}:9283"]
                   metrics_path: /metrics
                   metrics_filters:
                     include: ["ceph_osd_stat_byte*"]

Now it appears that only the last values file applied takes effect. Is this the correct way to use autodiscover? The only difference between the two files is the kubernetes.labels.rook_cluster condition, which distinguishes the two clusters.
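
My suspicion is that this is how Helm merges values rather than an autodiscover problem: assuming both files are applied to the same release, maps are merged key by key, but list values such as extraEnvs and providers are replaced wholesale by whichever file is applied last. Under that assumption, the effective values would look roughly like this:

metricbeat:
  extraEnvs:                 # lists are replaced, not merged: only the second file's entries survive
    - name: CEPH_API_USERNAME
      value: monitoring-ceph
    - name: CEPH_API_PASSWORD
      valueFrom:
        secretKeyRef:
          name: ceph-rdg-api-user
          key: monitoring-ceph
  config:
    metricbeat.autodiscover:
      providers:             # likewise, only the second file's provider list survives
        - type: kubernetes
          host: ${NODE_NAME}
          templates:
            - condition.equals:
                kubernetes.labels.rook_cluster: "rdg-rook-ceph"
              config:
                # ...ceph and prometheus module settings from the second values file only

If that is what is happening, both clusters need to live in a single values file, either as two providers or as two templates under one provider, which is what I ended up doing below.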

I got this working with the following configuration:

metricbeat: 
  extraEnvs:
    - name: CEPH_API_USERNAME
      value: monitoring-ceph
    - name: CEPH_API_PASSWORD
      valueFrom:
        secretKeyRef:
          name: ceph-api-user
          key: monitoring-ceph          
    - name: RDG_CEPH_API_USERNAME
      value: rdg-monitoring-ceph
    - name: RDG_CEPH_API_PASSWORD
      valueFrom:
        secretKeyRef:
          name: rdg-ceph-api-user
          key: rdg-monitoring-ceph          
  config:
     metricbeat.autodiscover:
       providers:
         - type: kubernetes
           identifier: rook-ceph
           host: ${NODE_NAME}
           templates:
             - condition.equals:
                  kubernetes.labels.rook_cluster: "rook-ceph"
               config:
                 - module: ceph
                   metricsets:
                   - mgr_cluster_disk
                   - mgr_osd_perf
                   - mgr_pool_disk
                   - mgr_osd_pool_stats
                   - mgr_osd_tree
                   period: 10s
                   hosts: ["https://<hardcoded ip of CEPH cluster manager pod>:8003"]
                   username: '${CEPH_API_USERNAME}'
                   password: '${CEPH_API_PASSWORD}'
                   ssl.verification_mode: "none"
                 - module: prometheus
                   period: 10s
                   hosts: ["<hardcoded ip of CEPH cluster manager pod>:9283"]
                   metrics_path: /metrics
                   metrics_filters:
                     include: ["ceph_osd_stat_byte*"]
         - type: kubernetes
           identifier: rdg-rook-ceph
           host: ${NODE_NAME}
           templates:
             - condition.equals:
                  kubernetes.labels.rook_cluster: "rdg-rook-ceph"
               config:
                 - module: ceph
                   metricsets:
                   - mgr_cluster_disk
                   - mgr_osd_perf
                   - mgr_pool_disk
                   - mgr_osd_pool_stats
                   - mgr_osd_tree
                   period: 10s
                   hosts: ["https://<hardcoded ip of RDG CEPH cluster manager pod>:8003"]
                   username: '${RDG_CEPH_API_USERNAME}'
                   password: '${RDG_CEPH_API_PASSWORD}'
                   ssl.verification_mode: "none"
                 - module: prometheus
                   period: 10s
                   hosts: ["<hardcoded ip of RDG CEPH cluster manager pod>:9283"]
                   metrics_path: /metrics
                   metrics_filters:
                     include: ["ceph_osd_stat_byte*"]

For some reason ${data.host} resolved to some random pod within the Ceph namespace rather than the Ceph Manager Daemon (mgr) pod, and I don't know why the correct one wasn't picked up, so I had to hard-code the manager pod IPs. I think the matching condition is too broad: I need to match on kubernetes.labels.app: rook-ceph-mgr, which still matches two pods (one per cluster), and then narrow further with kubernetes.labels.rook_cluster: rook-ceph or kubernetes.labels.rook_cluster: rdg-rook-ceph. I am not sure how to express that combination with condition.equals, though.
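
If I read the autodiscover condition syntax correctly, equals conditions can be combined under and, so something like the following should scope each template to the mgr pod of the right cluster and let ${data.host} resolve to that pod instead of a hard-coded IP. This is an untested sketch using a single provider with two templates; the metricsets and module settings are the same ones as above (the prometheus module follows the same pattern and is omitted here):

metricbeat.autodiscover:
  providers:
    - type: kubernetes
      host: ${NODE_NAME}
      templates:
        # first cluster: match only the rook-ceph manager pod
        - condition:
            and:
              - equals:
                  kubernetes.labels.app: "rook-ceph-mgr"
              - equals:
                  kubernetes.labels.rook_cluster: "rook-ceph"
          config:
            - module: ceph
              metricsets: [mgr_cluster_disk, mgr_osd_perf, mgr_pool_disk, mgr_osd_pool_stats, mgr_osd_tree]
              period: 10s
              hosts: ["https://${data.host}:8003"]
              username: '${CEPH_API_USERNAME}'
              password: '${CEPH_API_PASSWORD}'
              ssl.verification_mode: "none"
        # second cluster: match only the rdg-rook-ceph manager pod
        - condition:
            and:
              - equals:
                  kubernetes.labels.app: "rook-ceph-mgr"
              - equals:
                  kubernetes.labels.rook_cluster: "rdg-rook-ceph"
          config:
            - module: ceph
              metricsets: [mgr_cluster_disk, mgr_osd_perf, mgr_pool_disk, mgr_osd_pool_stats, mgr_osd_tree]
              period: 10s
              hosts: ["https://${data.host}:8003"]
              username: '${RDG_CEPH_API_USERNAME}'
              password: '${RDG_CEPH_API_PASSWORD}'
              ssl.verification_mode: "none"

Whether this behaves better than two providers with separate identifiers I can't say, but it would avoid hard-coding the manager pod IPs.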
