Prometheus metricset - query vs. collector

Hi!

At the moment I am collecting metrics from my OpenShift cluster with the collector metricset like so:

metricbeat.modules:
- module: prometheus
  period: 15s
  timeout: 15s
  hosts: ["https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091"]
  metrics_path: '/federate'
  query:
    'match[]': '{__name__=" *and a bunch of different metrics that I am interested in*"}'

However, I found out that it's also possible to run a query against the Prometheus server instead.
For example, I could do this:

- module: prometheus
  period: 15s
  timeout: 15s
  hosts: ["https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091"]
  metricsets: ["query"]
  queries:
  - name: 'cluster_operator_up'
    path: '/api/v1/query'
    params:
      query: "cluster_operator_up"

Is there a preferred way to do this?
Do you think the different metricsets could generate a different load depending on how many metrics I am querying or collecting?

Right now the OpenShift cluster is small (3 masters, 3 workers), but it will scale out in the future. I have deployed Metricbeat as a single pod with a Deployment. It sends the data to an external Logstash cluster.

Any insight would be much appreciated!

Hello,

First, I see that your collector configuration uses the /federate endpoint.
As per the documentation [Prometheus collector metricset | Metricbeat Reference [8.6] | Elastic], the Federation API returns all metrics as untyped, so it does not support calculating rates. If you care about that kind of metric, you could switch back to the default endpoint, which is /metrics.
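A rough sketch of that switch, assuming the target really serves the metrics you need at /metrics (the host below is simply copied from the question, so verify that first). The `use_types` and `rate_counters` options are the ones described in the collector metricset documentation; both default to false.

```yaml
metricbeat.modules:
- module: prometheus
  metricsets: ["collector"]
  period: 15s
  timeout: 15s
  # Assumption: host copied from the question; replace it with the endpoint
  # that actually exposes the typed metrics you want to scrape.
  hosts: ["https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091"]
  metrics_path: '/metrics'   # default path of the collector metricset
  use_types: true            # store metrics using their Prometheus types
  rate_counters: true        # also calculate a rate from counters
```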

Depending on scale, we recommend a different approach: the remote_write metricset. This requires additional configuration on your Prometheus server so that it can push metrics to your remote_write destination.
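As a minimal sketch of the remote_write setup, Metricbeat listens for pushes and Prometheus is told where to send them. Port 9201 and the /write path follow the example in the Metricbeat documentation; the listen address and the service name `metricbeat.example.svc` are placeholders, not values from this thread.

```yaml
# Metricbeat side: listen for remote_write requests from Prometheus.
metricbeat.modules:
- module: prometheus
  metricsets: ["remote_write"]
  host: "0.0.0.0"   # assumption: listen on all interfaces inside the pod
  port: "9201"
```

```yaml
# Prometheus side: push samples to the Metricbeat listener.
remote_write:
  # Hypothetical service name; use whatever address reaches your Metricbeat pod.
  - url: "http://metricbeat.example.svc:9201/write"
```

How the Prometheus part is applied depends on how your Prometheus is deployed and managed, so treat the second fragment as the plain prometheus.yml form only.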

If you have to choose between collector and query, it really depends on how specific your queries are. The collector can group metrics by labels, so some extra processing is done on the Beats side, whereas with query you are always on the safe side and control exactly which metrics are retrieved at any time. The collector also supports additional filters to include or exclude metrics. So if you anticipate high cardinality in your metrics, query might be the better option.
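To make the comparison concrete, here is a hedged sketch of both variants side by side. The `metrics_filters` include/exclude option is the collector filter mentioned above; the filter patterns are made-up examples, and `cluster_operator_up` is the metric from the question.

```yaml
metricbeat.modules:
# Variant 1: collector, narrowed down with include/exclude filters.
- module: prometheus
  metricsets: ["collector"]
  period: 15s
  hosts: ["https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091"]
  metrics_path: '/metrics'
  metrics_filters:
    include: ["cluster_operator_*"]             # illustrative pattern
    exclude: ["cluster_operator_conditions*"]   # illustrative pattern

# Variant 2: query, where each PromQL expression is listed explicitly.
- module: prometheus
  metricsets: ["query"]
  period: 15s
  hosts: ["https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091"]
  queries:
  - name: 'cluster_operator_up'
    path: '/api/v1/query'
    params:
      query: "cluster_operator_up"
```

With the query variant, only the series matched by each expression are retrieved, which is what makes it easier to keep the volume predictable as the cluster grows.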

Thank you, Andreas

I think I understand how and when to use each metricset now.