Metricbeat not sending data or Elasticsearch not receiving (?)

Hi Guys,

I hope you can help me out with the following.

I'm using:

  • Elasticsearch 8.4.3
  • Metricbeat 8.4.3
  • Kubernetes / AWS EKS

I've deployed Metricbeat in a Kubernetes cluster. It is gathering all sorts of data, and I also want it to scrape a Prometheus endpoint. The endpoint is available via:

testservice.default:9090

My prometheus module config looks like this:

    - module: prometheus
      metricsets: ["collector"]
      enabled: true
      period: 10s
      hosts: ["testservice.default:9090"]
      metrics_path: /metrics
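
For context, the output section is just the standard Elasticsearch output, shared by all modules; roughly along these lines, with placeholder values:

    output.elasticsearch:
      hosts: ["https://my-elasticsearch:9200"]
      username: "${ELASTICSEARCH_USERNAME}"
      password: "${ELASTICSEARCH_PASSWORD}"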

So this seems to be working, as I don't see any errors in the Metricbeat logs. However, I don't see these metrics arriving in my Elastic Stack (other data that Metricbeat collects is being received).

I turned on debug logging in Metricbeat to see if that would help me identify what the issue could be, but there are no errors there either.
I see this log line:

{"log.level":"debug","@timestamp":"2023-03-02T13:56:10.972Z","log.logger":"module","log.origin":{"file.name":"module/wrapper.go","file.line":191},"message":"Starting metricSetWrapper[module=prometheus, name=collector, host=testservice.default:9090]","service.name":"metricbeat","ecs.version":"1.6.0"}

which suggests to me the module is starting.
I also start seeing the events I want in the debug logs. However, I don't see them appearing in Elasticsearch. I searched all indices via:

GET /_search
{
  "query": {
    "query_string": {
      "query": "myapp"
    }
  }
}

and the events are not present in any index.
The Elasticsearch logs themselves don't show any errors.
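
As an aside, to double-check which indices exist at all before searching, something like this works in Dev Tools:

    GET _cat/indices?v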

What am I doing wrong?
Any help or suggestion will be appreciated!

Hi @enigmatic

Did you try to access the metrics endpoint from the Metricbeat container, using curl?

kubectl exec -it metricbeat-pod -- bash
$ curl testservice.default:9090/metrics

Are you getting a valid response?

Hi Tetiana,

Thanks for your reply. Yes, I'm getting a valid response.

I can also see the events I'm looking for in the Metricbeat debug log.

However, I don't see these metrics arriving in my Elastic Stack (other data that Metricbeat collects is being received).

Could you clarify?
Do you have one Metricbeat container that collects Prometheus metrics and metrics from other application(s)? Are both modules using the same output configuration?

Do you install Metricbeat using a DaemonSet, similar to this example?

Hi Tetiana,

Yes, I deploy Metricbeat via a DaemonSet (like in the link).

I have two nodes running at the moment; on each node a container with Metricbeat is running.
They are indeed using the same output configuration.
I have the kubernetes, system, and prometheus modules running.

Not sure if it's relevant, but the containers also elect a leader; that becomes clear in the logs.

Kr,
Nathan

So I added Prometheus/Grafana within the same cluster and was able to get/see the data I want
(using the autodiscover function).
I'm completely lost as to why this isn't working with Metricbeat.

Kr,
Nathan

Can you go to Kibana Discover and see if there are any documents in the data views metricbeat-* or metrics-*?

If so, are there any fields that start with prometheus?
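
For example, something along these lines in Dev Tools should surface them (a sketch; adjust the index pattern to your setup):

    GET metricbeat-*/_search
    {
      "query": {
        "term": { "event.module": "prometheus" }
      }
    }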

Hi @stephenb
Thanks for your reply!
There aren't any fields in the index that start with prometheus.

Edit: this is not entirely true.
I have some other applications on-prem where I use the prometheus module.
Those are ingested successfully into Elasticsearch.

But you are getting the system metrics etc.?

This is just a suggestion... What I would do is temporarily disable the kubernetes and system modules in your Metricbeat config.

Then set
logging.level: debug

And look at the Metricbeat logs...

You can also set the -d "*" flag on the metricbeat command in the config; that will produce even more output...
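
That is, something like this in metricbeat.yml (a sketch; the selectors line is optional and just narrows the output to specific loggers):

    logging.level: debug
    logging.selectors: ["module", "processors", "elasticsearch"]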

What exactly does this mean? Can you elaborate...

In the logs there should be some entries that contain events.published etc. Do you see those with only the prometheus module configured?

Hi @stephenb ,

Sorry for the confusion; I just updated my post before I saw your comment.

  • Yes, I'm getting system metrics.
  • Thanks for the suggestion! I will do that tomorrow morning.

In the debug logs I can see the events that Metricbeat gets from the configured Prometheus endpoint (based on the content of the events).

The line starts with:

{"log.level":"debug","@timestamp":"2023-03-06T09:26:22.760Z","log.logger":"processors","log.origin":{"file.name":"processing/processors.go","file.line":210},"message":"Publish event:

I also see that Elasticsearch is publishing events:

{"log.level":"debug","@timestamp":"2023-03-06T09:26:23.158Z","log.logger":"elasticsearch","log.origin":{"file.name":"elasticsearch/client.go","file.line":247},"message":"PublishEvents: 50 events have been published to elasticsearch in 43.16418ms.","service.name":"metricbeat","ecs.version":"1.6.0"}

I will update further tomorrow when I disable the kubernetes and system modules.

Apologies, but Elasticsearch does not publish events; Metricbeat does... So to be clear, you are looking at the Metricbeat logs?

If you exec into the Metricbeat pod, you should be able to run a test of the modules; this should hit the Prometheus endpoint / config you have in the modules.

./metricbeat test modules
prometheus...
  collector...OK
    result: 
    {
     "@timestamp": "2023-03-08T16:32:59.461Z",
     "event": {
      "dataset": "prometheus.collector",
      "duration": 3328208,
      "module": "prometheus"
     },
     "metricset": {
      "name": "collector",
      "period": 10000
     },
     "prometheus": {
      "labels": {
       "code": "200",
       "handler": "found",
       "instance": "localhost:8080",
       "job": "prometheus",
       "le": "0.1",
       "method": "get"
      },
      "metrics": {
       "http_request_duration_seconds_bucket": 2
      }
     },
     "service": {
      "address": "http://localhost:8080/metrics",
      "type": "prometheus"
     }
    }
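
You can verify the connection to Elasticsearch the same way:

    ./metricbeat test output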

Also, if you have logging.level: debug set,

you should see Publish event output containing \"dataset\": \"prometheus.collector\", like:

{"log.level":"debug","@timestamp":"2023-03-08T08:35:45.366-0800","log.logger":"processors","log.origin":{"file.name":"processing/processors.go","file.line":210},"message":"Publish event: {\n  \"@timestamp\": \"2023-03-08T16:35:45.362Z\",\n  \"@metadata\": {\n    \"beat\": \"metricbeat\",\n    \"type\": \"_doc\",\n    \"version\": \"8.4.3\"\n  },\n  \"event\": {\n    \"dataset\": \"prometheus.collector\",\n    \"module\": \"prometheus\",\n    \"duration\": 3688206\n  },\n  \"metricset\": {\n    \"name\": \"collector\",\n    \"period\": 10000\n  },\n  \"prometheus\": {\n    \"labels\": {\n      \"handler\": \"found\",\n      \"method\": \"get\",\n      \"le\": \"2.5\",\n      \"instance\": \"localhost:8080\",\n      \"job\": \"prometheus\",\n      \"code\": \"200\"\n    },\n    \"metrics\": {\n      \"http_request_duration_seconds_bucket\": 2\n    }\n  },\n  \"service\": {\n    \"address\": \"http://localhost:8080/metrics\",\n    \"type\": \"prometheus\"\n  },\n  \"host\": {\n    \"os\": {\n      \"name\": \"macOS\",\n      \"kernel\": \"21.6.0\",\n      \"build\": \"21G419\",\n      \"type\": \"macos\",\n      \"platform\": \"darwin\",\n      \"version\": \"12.6.3\",\n      \"family\": \"darwin\"\n    },\n    \"id\": \"9E46F076-B7F1-53AA-921B-C2F983746B79\",\n    \"name\": \"hyperion\",\n    \"ip\": [\n      \"fe80::aede:48ff:fe00:1122\",\n      \"fe80::183a:c2d4:bc27:f1ec\",\n      \"192.168.1.159\",\n      \"fe80::3468:7ff:fe33:fcee\",\n      \"fe80::3468:7ff:fe33:fcee\",\n 

Also, look in the logs for mapping exceptions.
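
If you are grepping the pod logs, something like this can help narrow things down (the pod name is illustrative):

    $ kubectl logs metricbeat-abc12 | grep 'prometheus.collector'
    $ kubectl logs metricbeat-abc12 | grep -i 'mapping'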

@stephenb ,

Sorry for the confusion; I am indeed looking at the Metricbeat logs.

This morning I disabled all other modules and looked at the debug logs again; the information below is based on that.

I did this, and the module is working. The output of the command is:

prometheus...
  collector...OK
    result:
    {
     "@timestamp": "2023-03-09T08:21:48.304Z",
     "event": {
      "dataset": "prometheus.collector",
      "duration": 18134449,
      "module": "prometheus"
     },
     "metricset": {
      "name": "collector",
      "period": 10000
     },
     "prometheus": {
      "labels": {
       "instance": "testservice.default:9090",
       "job": "prometheus",
       "port": "6001",
       "space": "code"
      },
      "metrics": {
       "app_nodejs_heap_space_size_available_bytes": 26432,
       "app_nodejs_heap_space_size_total_bytes": 1155072,
       "app_nodejs_heap_space_size_used_bytes": 1046720
      }
     },
     "service": {
      "address": "http://testservice.default:9090/metrics",
      "type": "prometheus"
     }
    }

Correct, I can see these events in the debug logs.

I don't see any exceptions / mapping errors in the logs.

There are a few errors in the debug logs, but they are related to add_cloud_metadata, like:

{"log.level":"debug","@timestamp":"2023-03-09T08:14:28.469Z","log.logger":"add_cloud_metadata","log.origin":{"file.name":"add_cloud_metadata/providers.go","file.line":167},"message":"add_cloud_metadata: received disposition for huawei after 1.491925ms. result=[provider:huawei, error=failed with http status code 401, metadata={}]","service.name":"metricbeat","ecs.version":"1.6.0"}

But I expect this, as the node / cluster is in AWS and not Huawei.

So I started looking further in Elasticsearch to check whether I had overlooked something, and it turns out I was searching for the wrong data!
I was searching for the app name (as I've given the metrics a special prefix); however, that prefix was used in the field name in Elasticsearch, so my search was never going to match :expressionless:
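
In hindsight this makes sense: a query_string query matches indexed values, not field names. Since the prefix ended up in the field names, a check like this (using a field name from the test output above) would have found the documents right away:

    GET metricbeat-*/_search
    {
      "query": {
        "exists": { "field": "prometheus.metrics.app_nodejs_heap_space_size_used_bytes" }
      }
    }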

@stephenb Thanks for your help; the suggestion to disable all other modules really did the trick for me!
