Jolokia autodiscovery

Hi,
I am currently in use of ELK cluster based on version 7.3.1 under Kubernetes environment. I also use Jolokia module inside metricbeat in order to gather many metrics. I wander does anyone can help me regarding Jolokia auto-discovery problem I have in recent versions.
Description: In the past I used Kubernetes autodiscovery together with the Jolokia module to detect changes in the Kubernetes cluster (new services, removal of old services, and so on, in order to get the required JMX metrics from them). This combination worked perfectly until version 7.3.1. Now there are often stale references inside Metricbeat when Jolokia loses its reference to deleted services. Data keeps coming in, but the lost references accumulate over time. New services are detected by Jolokia without problems; the issue occurs only when a service is deleted from the Kubernetes cluster.

I am aware that the Jolokia module has its own autodiscovery now. Can someone give me a clue how to configure the Jolokia autodiscovery parameters (interface name, grace period, probe timeout) for the case where Metricbeat runs inside a Kubernetes cluster with services that expose metrics via Jolokia? Thanks in advance.
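For reference, the Jolokia autodiscover provider is configured roughly like this (a sketch based on the Beats documentation; the interface name, timings, condition field, and template contents are illustrative and should be adapted):

```yaml
metricbeat.autodiscover:
  providers:
    - type: jolokia
      interfaces:
        # Network interface on which to send Jolokia Discovery probes.
        - name: eth0
          interval: 15s       # how often probes are sent (illustrative value)
          grace_period: 45s   # how long a silent instance is kept (illustrative value)
          probe_timeout: 1s   # how long to wait for probe responses (illustrative value)
      templates:
        - condition:
            contains:
              jolokia.server.product: "tomcat"
          config:
            - module: jolokia
              metricsets: ["jmx"]
              hosts: "${data.jolokia.url}"
```

Note that this provider relies on the Jolokia Discovery mechanism, which uses UDP multicast; whether multicast works between pods depends on the Kubernetes network plugin, which may be why it is rarely documented for in-cluster use.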

This is the part of the Metricbeat configuration covering the usage of Kubernetes autodiscovery together with the Jolokia module. This worked correctly on previous versions of Metricbeat. We also tried Metricbeat 7.5, but the problem did not go away.

Configuration:
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            contains:
              kubernetes.pod.name: "some-kubernetes-pod"
          config:
            - module: jolokia
              metricsets: ["jmx"]
              ...

So could someone help me set up Jolokia's own autodiscovery, as described at the end of the official documentation on the Jolokia autodiscovery provider, but for the case of Metricbeat inside a Kubernetes cluster:
https://www.elastic.co/guide/en/beats/metricbeat/master/configuration-autodiscover.html

I also forgot to post the error from the Metricbeat logs when the Jolokia module loses its reference to a service from which it pulls metrics:
INFO module/wrapper.go:252 Error fetching data for metricset jolokia.jmx: error making http request: Post http://<some_ip>:7777/jolokia/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
This happens when I delete pods from which the Jolokia module pulled metrics.

If some other service comes into existence in the meantime on the same IP, Jolokia starts to pull metrics from this new service. This can create serious problems, because metrics of one service end up presented as metrics of a completely different service. Why the Jolokia module is not aware of the removal of services from the Kubernetes cluster is something I have tried but failed to explain. I am guessing that the new Jolokia autodiscovery provider should be used, but it lacks proper documentation for the case where Metricbeat and the Jolokia module are used inside a Kubernetes cluster. Any help?

Is it possible that this problem exists in a related form on other Kubernetes setups that use Metricbeat as the primary source of metrics? Could we at least share problems that belong to this set of challenges? Maybe this could lead us to a solution.

Sorry for the late response! For the first question about Jolokia autodiscover configuration, here is the documentation: https://www.elastic.co/guide/en/beats/metricbeat/current/configuration-autodiscover.html#_jolokia

It seems to me there is some problem with Kubernetes autodiscover: when a service is deleted from the k8s cluster, autodiscover doesn't detect it. @exekias, do you know of any known issue around this?

Hey @Milan_Todorovic,

If you are using Kubernetes, I recommend you continue using Kubernetes autodiscover; it will work for all applications you deploy in Kubernetes, whether they are Java applications or not. Jolokia autodiscover is intended for deployments where no other supported orchestrator is used.

Regarding the errors when pods are deleted, I think they are caused by the fact that Beats have a grace period before they stop monitoring a stopped pod. This is especially useful in Filebeat, where the logs may not have been fully read when the pod stops, but it is true that its usefulness in Metricbeat is more limited. You can disable this grace period by setting cleanup_timeout: 0s in your autodiscover provider configuration.


Hi, thanks for your suggestion. I added cleanup_timeout: 0s inside the add_docker_metadata section of the Metricbeat config file. But after a few Kubernetes pods restarted I got:
INFO module/wrapper.go:252 Error fetching data for metricset jolokia.jmx: error making http request: Post http://<some_ip>:7777/jolokia/: dial tcp <some_ip>:7777: connect: no route to host

This is not much different from previous error.

This option should be added to the kubernetes provider section, something like this:

metricbeat.autodiscover:
  providers:
  - type: kubernetes
    cleanup_timeout: 0s
    templates:
      ...

add_kubernetes_metadata shouldn't be needed in your case; modules instantiated by autodiscover already add this metadata. add_kubernetes_metadata is only needed for events collected by static configurations (unlike the ones created by the autodiscover provider).
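To illustrate the distinction (a sketch; the service name and port are made up), a statically configured module needs the processor to enrich its events with pod metadata, whereas an autodiscover template does not:

```yaml
# Static configuration: events carry no pod context on their own,
# so add_kubernetes_metadata is needed to attach it.
metricbeat.modules:
  - module: jolokia
    metricsets: ["jmx"]
    hosts: ["some-service:7777/jolokia/"]

processors:
  - add_kubernetes_metadata: ~

# With autodiscover, the kubernetes provider already knows which pod
# each instantiated module belongs to and attaches the metadata itself,
# so no extra processor is required for those events.
```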

This unfortunately did not work. We still have the identical problem. Any suggestions? Thanks.

Could you share the configuration you have now?

It looks like this:

metricbeat.modules:
- module: kubernetes
  labels.dedot: true
  metricsets:
    - node
    - system
    - pod
    - container
  period: 30s
  hosts: ["${NODE_NAME}:10255"]
- module: docker
  metricsets: ["diskio"]
  hosts: ["unix:///var/run/docker.sock"]
  period: 30s

metricbeat.autodiscover:
  providers:
      - type: kubernetes
	cleanup_timeout: 0s
        templates:
          - condition:
              contains:
                kubernetes.pod.name: "something"
            config:
              - module: jolokia
                metricsets: ["jmx"]
                hosts: "${data.host}:7777/jolokia/"
                namespace: "something"
                jmx.mappings:
......................................................

processors:
- add_docker_metadata: ~

output.elasticsearch:
  hosts: ${ELASTICSEARCH_HOST}:${ELASTICSEARCH_PORT}

setup.kibana:
  host: "kibana:80"

setup.template.overwrite: true
setup.template.fields: "fields.yml"

cleanup_timeout should be at the same level as type and templates. In the config you copied it seems to have one more level of indentation; it should be like this:

...
metricbeat.autodiscover:
  providers:
      - type: kubernetes
        cleanup_timeout: 0s
        templates:
...

Did you try upgrading Metricbeat? It'd be good to confirm whether your issues persist on a recent version.

The indentation was correct; it was displaced when I reformatted this post. Also, the problem has persisted since version 7.3.