APM-server autodiscovery

Hi guys! I am using APM Server 7.15.0 with the same version of Kibana (ELK stack), running on Kubernetes. I would like to know whether APM supports autodiscovery. We use it with Filebeat, Metricbeat and Heartbeat to add a custom field that identifies our application by ID. This works really well with those Beats, but I can't find anything about that feature in the APM documentation. Is it supported? Do you have some documentation to share with me?

Regards!

I don't think the APM agent for PHP does this, as the web servers containing your application require explicit configuration (e.g., changing the php.ini file) to emit traces and spans.

Hi @riferrei how are you? I am talking about apm-server. I am using the official helm chart.

Regards!

Could you please explain a bit more what you mean by autodiscover in the apm-server? As in, what are you trying to accomplish?

@riferrei let me explain. We use autodiscovery for Filebeat, Metricbeat and Heartbeat to insert custom fields that we can filter on in our Kibana dashboards. With autodiscovery enabled, every time we deploy a new application we can filter by application and by cluster (we set cluster-id and app-id). We run on Kubernetes, so every cluster-id corresponds to one Kubernetes cluster. That is working well.

  apm-server.yml: |
    apm-server:
      host: "0.0.0.0:8200"
      # Keys in this block are relative to the enclosing apm-server namespace,
      # so they must not repeat the "apm-server." prefix.
      rum.enabled: true
      rum.allow_origins: ['*']
      kibana.enabled: true
      kibana.host: "kibana.${ENVIRONMENT}.example.com"
      kibana.protocol: "https"
      kibana.username: '${ELASTICSEARCH_USERNAME}'
      kibana.password: '${ELASTICSEARCH_PASSWORD}'
    queue: {}
    fields:
      bucketID: '${CLUSTER_ID}'
    fields_under_root: true
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elastic-client-${ENVIRONMENT}.${ENVIRONMENT}.example.com:443"]

For example, this is our APM config. The fields statement for the cluster ID (bucketID) is working, but the application is dynamic: we deploy several applications, and we need to set the custom field appID every time an app is deployed. For example, this is our configuration for Filebeat:

  filebeat.yml: |
    filebeat.inputs:
    - type: container
      fields:
        bucketID: '${CLUSTER_ID}'
      fields_under_root: true
      paths:
        - /var/log/containers/*.log
      processors:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
      - dissect:
          tokenizer: "%{visualize.message}"
          field: "message"
          target_prefix: ""
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true
          templates:
            - condition.regexp:
                kubernetes.labels.appID: '.*'
              config:
                - type: container
                  paths:
                    - /var/log/containers/*-${data.kubernetes.container.id}.log
                  containers.ids:
                    - ${data.kubernetes.container.id}
                  fields:
                    appID: ${data.kubernetes.labels.appID}
                  fields_under_root: true
    output.elasticsearch:
      username: '${ELASTICSEARCH_USERNAME}'
      password: '${ELASTICSEARCH_PASSWORD}'
      protocol: https
      hosts: ["elastic-client-${ENVIRONMENT}.${ENVIRONMENT}.example.com:443"]

As I said before, this works well for Beats, so my question is whether we can do the same or something similar for apm-server. I am not sure if it is supported.

Is it clearer now?

regards!

I think the confusion here is that you want to create dynamic fields representing cluster and application identifiers at the APM Server level, whereas this should be done at the agent level. Even though the APM Server shares many of its configuration options with the Beats, its relationship with applications is different. Beats can introspect applications dynamically using their Docker/K8S support, which lets you tag things and create dynamic fields. You can't do this at the APM Server level, AFAIK.

With APM, it is the agent that knows the application details. So, for instance, if you want to create a field called hostname that represents which cluster is being used by the app, in the PHP agent, you need to specify this using the ELASTIC_APM_HOSTNAME environment variable. Once the agent collects all the data, it dispatches it to the APM Server. At this point, the APM Server only knows the data that has been sent by the agents.

Not sure if that clarifies, and I'm glad to discuss this further if my understanding of the problem is not there yet — but I think what you want to achieve is not supported in the APM Server. Maybe someone from the engineering team can comment otherwise, hopefully.

@riferrei yeah, it's clear. My doubt came from not finding anything about it in the documentation, but thanks anyway!
I am sure now that I have to work from the agents.

Thanks a lot!

In our setup we set the environment variable ELASTIC_APM_ENVIRONMENT to distinguish between our different Kubernetes clusters. We do so by injecting that setting when we create our service deployments as a step in the release pipeline. This ensures that each service gets correctly tagged with the environment we are currently deploying it into.

The ELASTIC_APM_ENVIRONMENT setting is shown as Environment in the top right corner in Kibana APM for quick filtering.
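
A minimal sketch of what such a deployment could look like, in case it helps. The service name, image and the "cluster-a" value are placeholders I made up for illustration; in a setup like ours the release pipeline would substitute the real value per cluster.

```yaml
# Hypothetical Deployment; all names and values below are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: registry.example.com/my-service:1.0.0
          env:
            # Injected per cluster by the release pipeline; the APM agent
            # reports it as the service environment.
            - name: ELASTIC_APM_ENVIRONMENT
              value: "cluster-a"
```

The agent reads the variable at startup, so nothing has to change on the APM Server side.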

It has worked well for us. Maybe something similar is possible for you.

Thanks @Erik_Rydgren. Yes, we did that, per environment and per service.name, and we can filter on those. The problem is that APM is isolated from Metricbeat, Filebeat and Heartbeat: when we build dashboards, we can't use the same filter for all of them.
For example, one dashboard shows metrics, logs and uptime for an application. Using the "Control" visualization we filter by applicationID, and it works perfectly.
But if we want panels for transactions or service throughput per minute, we can't add them, because that field doesn't exist in the APM index.
In APM you can use custom fields, which is how we set the cluster ID as I showed in the message above, but autodiscovery is not available.
So we lose that functionality.

Now I think I understand your situation. You want to take selected labels from your pod deployment and add those labels to the APM events. Then the answer from @riferrei above is correct. This is something that needs to be done on the agent level in each pod since that is the only place where the pod labels exist.

I would suggest that when you apply the deployment labels also set appropriate environment variables to be used where APM transactions are created. Some APM agents already have support for global labels through the setting ELASTIC_APM_GLOBAL_LABELS. These labels would be applied automatically by the agent when an event is created. If your APM agent doesn't support this setting you need to apply the labels yourself.
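
As a rough sketch of that idea, assuming your agent supports ELASTIC_APM_GLOBAL_LABELS and your pods carry an appID label: the Kubernetes Downward API can expose the pod label as an environment variable, which is then referenced in the global labels setting. The service name, image, label value and the APP_ID helper variable are all placeholders of mine, not something from your setup.

```yaml
# Hypothetical Deployment; names, image and label values are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
        appID: "my-app-id"
    spec:
      containers:
        - name: my-service
          image: registry.example.com/my-service:1.0.0
          env:
            # Downward API: copy the pod's own appID label into a variable.
            - name: APP_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['appID']
            # $(APP_ID) is expanded by Kubernetes because APP_ID is
            # defined earlier in this same env list.
            - name: ELASTIC_APM_GLOBAL_LABELS
              value: "appID=$(APP_ID)"
```

As far as I know, global labels are stored under the labels namespace in the APM documents, so the value would show up as labels.appID rather than as a top-level appID field.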

@Erik_Rydgren yes, I am using the "service.name" field, for example. I can put my applicationID there, and that works perfectly. The only problem comes when I try to create a dashboard over both indexes (I use a "Control" panel to select the applicationID): some panels lose their data, because for Metricbeat/Filebeat I filter on the field "appID" and for APM I filter on service.name. Being able to set this through autodiscovery would be great, but now I understand it is not possible.

thanks a lot!