I'm doing a POC to use a Metricbeat pod to scrape all metrics from a Prometheus pod in the same namespace of a Kubernetes cluster. The Metricbeat pod is able to connect to the Prometheus pod's /federate endpoint and gather results. Something in the result set is causing a JSON processing error. I found another post with a similar error where someone recommended turning on the DEBUG logging level. I did that and harvested the errors and the associated JSON.
First, here is the configuration of the metricbeat pod:
Here are the configmaps the yaml refers to:
# MetricBeat Modules ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: monitoring
  labels:
    app: metricbeat-prometheus
data:
  prometheus.yml: |-
    - module: prometheus
      period: 10s
      hosts: ["prometheus-k8s.monitoring:9090"]
      metrics_path: "/federate"
      query:
        match: '{__name__!=""}'
      #username: "user"
      #password: "secret"
      # This can be used for service account based authorization:
      # bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      #ssl.certificate_authorities:
      # - /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
There is another very similar post where it was recommended to enable the DEBUG logging level. You can see that post here. I have done that and I have some data captured from a transaction but it's very large. Assuming someone is willing to help, could I send you the file directly instead of posting it here? I didn't see a way to attach a file.
Hi @drew.flint, sorry for the late response! I see we just opened https://github.com/elastic/beats/issues/13750 to track the NaN problem. I didn't see any error message in your previous comments, but I assume this is the issue you are having?
I was able to identify the NaN labels in Prometheus and built a query to exclude those. However, now I'm only receiving "Unable to decode response from prometheus endpoint". I'm failing in a whole new and exciting way! Any help you may have to offer would be greatly appreciated. Thanks!
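For anyone trying to reproduce the step of finding the NaN series, here is a minimal Python sketch. It assumes that "NaN labels" means series whose sample value is the literal NaN in the text returned by /federate. The function name find_nan_samples and the sample data are made up for illustration; they do not come from this thread or from Metricbeat.

```python
# Hypothetical helper: scan Prometheus text-format output for NaN sample values.
# SAMPLE is invented data, not output captured from the cluster in this thread.
SAMPLE = """# TYPE go_gc_duration_seconds untyped
go_gc_duration_seconds{instance="10.0.0.1:9100",quantile="0.5"} NaN 1568900000000
up{instance="10.0.0.1:9100",job="node"} 1 1568900000000
some_gauge NaN
"""


def find_nan_samples(exposition_text):
    """Return (metric_name, full_series) pairs whose sample value is NaN."""
    hits = []
    for raw in exposition_text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Split at the last "}" so spaces inside label values are not
        # mistaken for the separator between series and value.
        close = line.rfind("}")
        if close != -1:
            series, rest = line[: close + 1], line[close + 1:].split()
        else:
            fields = line.split()
            series, rest = fields[0], fields[1:]
        # The first field after the series is the value; a timestamp may follow.
        if rest and rest[0].lower() == "nan":
            hits.append((series.split("{", 1)[0], series))
    return hits


if __name__ == "__main__":
    for name, series in find_nan_samples(SAMPLE):
        print(name, series)
```

The metric names it reports could then be excluded in the selector passed to /federate, which is one way to build the kind of query described above.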
I believe issues 13749 and 13750 are both associated with the Elastic Support case I opened shortly after creating this community post. If the case number would be helpful, I'm glad to get that for you. If it's any help, I'm able to run a curl command and get results from a /bin/bash shell on the metricbeat container.
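To compare what curl returns with what Metricbeat requests, it can help to build the exact URL by hand. The sketch below assumes a match-all selector of {__name__!=""} and plain HTTP; the function name federate_url is hypothetical. Prometheus's federation endpoint selects series through the match[] URL parameter, so that is the parameter name used here.

```python
from urllib.parse import urlencode


def federate_url(host, selector, scheme="http"):
    """Build a /federate URL with the selector URL-encoded as match[]."""
    # urlencode percent-encodes the brackets, braces, and quotes.
    query = urlencode({"match[]": selector})
    return f"{scheme}://{host}/federate?{query}"


if __name__ == "__main__":
    # Host taken from the module config; the selector is an assumption.
    print(federate_url("prometheus-k8s.monitoring:9090", '{__name__!=""}'))
```

Passing the printed URL to curl from the Metricbeat container should show the same payload Metricbeat has to decode, assuming the module sends an equivalent query.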