APM server : Log events filtered out but not spans

Kibana version: 7.17.14

Elasticsearch version: 7.17.14

APM Server version: 7.17.14

APM Agent language and version: OTLP Rust 0.13

Original install method (e.g. download page, yum, deb, from source, etc.) and version: DEB, 7.17.13

Fresh install or upgraded from other version? Upgraded from 7.17.13, but didn't work previously either

Is there anything special in your setup? For example, are you using the Logstash or Kafka outputs? Are you using a load balancer in front of the APM Servers? Have you changed index pattern, generated custom templates, changed agent configuration etc.

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant):

I have ELK + APM server environments, one for dev purposes on my local machine (setup with GitHub - deviantony/docker-elk: The Elastic stack (ELK) powered by Docker and Compose.) and one on the production server (installed from DEBs).
Everything works great in my local docker setup, but in my production environment only the spans and metrics are showing up in the Discovery or APM tabs.

I can't find the difference in config between the two instances. I checked the settings (both from .yml and on the UI) for Elasticsearch, Kibana, APM server and the ingest pipeline (from the UI and from the definitions)

Provide logs and/or server output (if relevant):
Here is an excerpt from the APM server logs :

{"log.level":"debug","@timestamp":"2023-10-16T14:03:24.783Z","log.logger":"publisher","log.origin":{"file.name":"pipeline/client.go","file.line":232},"message":"Pipeline client receives callback 'onFilteredOut' for event: {Timestamp:2023-10-16 14:03:14.1115569 +0000 UTC Meta:{\"pipeline\":\"apm\"} Fields:{\"agent\":{\"name\":\"otlp\",\"version\":\"unknown\"},\"data_stream.type\":\"logs\",\"ecs\":{\"version\":\"1.12.0\"},\"labels\":{\"code_filepath\":\"C:\\\\Users\\\\VssAdministrator\\\\.cargo\\\\registry\\\\src\\\\index.crates.io-6f17d22bba15001f\\\\tower-http-0.4.3\\\\src\\\\trace\\\\on_request.rs\",\"code_lineno\":80,\"code_namespace\":\"tower_http::trace::on_request\",\"level\":\"DEBUG\",\"service_language_name\":\"Rust\",\"target\":\"tower_http::trace::on_request\"},\"message\":\"started processing request\",\"observer\":{\"ephemeral_id\":\"e5604b83-bca9-4b76-8152-515a01944c7a\",\"hostname\":\"REDACTED\",\"id\":\"390b2967-e1f1-497c-9681-cb46fd0a17b2\",\"type\":\"apm-server\",\"version\":\"7.17.14\",\"version_major\":7},\"processor\":{\"event\":\"log\",\"name\":\"log\"},\"service\":{\"environment\":\"REDACTED\",\"framework\":{\"name\":\"opentelemetry-otlp\",\"version\":\"0.13.0\"},\"language\":{\"name\":\"unknown\"},\"name\":\"REDACTED\",\"version\":\"1.0\"},\"trace\":{\"id\":\"2e9e6b9d01ee9ca2bb7b131904d6f3e9\"}} Private:<nil> TimeSeries:false}","service.name":"apm-server","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2023-10-16T14:03:24.783Z","log.logger":"publisher","log.origin":{"file.name":"pipeline/client.go","file.line":232},"message":"Pipeline client receives callback 'onFilteredOut' for event: {Timestamp:2023-10-16 14:03:21.5653449 +0000 UTC Meta:{\"pipeline\":\"apm\"} Fields:{\"agent\":{\"name\":\"otlp\",\"version\":\"unknown\"},\"data_stream.type\":\"logs\",\"ecs\":{\"version\":\"1.12.0\"},\"labels\":{\"code_filepath\":\"C:\\\\Users\\\\VssAdministrator\\\\.cargo\\\\registry\\\\src\\\\index.crates.io-6f17d22bba15001f\\\\tower-http-0.4.3\\\\src\\\\trace\\\\on_response.rs\",\"code_lineno\":114,\"code_namespace\":\"tower_http::trace::on_response\",\"latency\":\"7453 ms\",\"level\":\"DEBUG\",\"service_language_name\":\"Rust\",\"status\":200,\"target\":\"tower_http::trace::on_response\"},\"message\":\"finished processing request\",\"observer\":{\"ephemeral_id\":\"e5604b83-bca9-4b76-8152-515a01944c7a\",\"hostname\":\"REDACTED\",\"id\":\"390b2967-e1f1-497c-9681-cb46fd0a17b2\",\"type\":\"apm-server\",\"version\":\"7.17.14\",\"version_major\":7},\"processor\":{\"event\":\"log\",\"name\":\"log\"},\"service\":{\"environment\":\"REDACTED\",\"framework\":{\"name\":\"opentelemetry-otlp\",\"version\":\"0.13.0\"},\"language\":{\"name\":\"unknown\"},\"name\":\"REDACTED\",\"version\":\"1.0\"},\"trace\":{\"id\":\"2e9e6b9d01ee9ca2bb7b131904d6f3e9\"}} Private:<nil> TimeSeries:false}","service.name":"apm-server","ecs.version":"1.6.0"}

The "APM" pipeline is set from the standard definition :

apm:
  description: Default enrichment for APM events
  processors:
  - pipeline:
      # apm_ingest_timestamp should always come first,
      # ensuring `event.ingested` is set as early as
      # possible.
      name: apm_ingest_timestamp
  - pipeline:
      name: apm_user_agent
  - pipeline:
      name: apm_user_geo
  - pipeline:
      name: apm_remove_span_metadata
  - pipeline:
      name: apm_error_grouping_name
      if: ctx.processor?.event == 'error'
  - pipeline:
      name: apm_metrics_dynamic_template
      if: ctx.processor?.event == 'metric'

# apm_data_stream_migration is not used in the main apm pipeline,
# it is installed for migrating legacy indices to data streams,
# e.g. using the Kibana Upgrade Assistant.
apm_data_stream_migration:
  description: Migrate APM events to data streams
  processors:
  - script:
      if: ctx.processor?.event == 'span' || ctx.processor?.event == 'transaction'
      source: |
        ctx.data_stream = ["type": "traces", "dataset": "apm", "namespace": "migrated"]
  - script:
      if: ctx.processor?.event == 'error'
      source: |
        ctx.data_stream = ["type": "logs", "dataset": "apm.error", "namespace": "migrated"]
  - script:
      if: ctx.processor?.event == 'metric'
      source: |
        String dataset;
        if (ctx["metricset.name"] != "app") {
          dataset = "apm.internal";
        } else {
          String serviceName = ctx.service.name;
          serviceName = serviceName.toLowerCase();
          serviceName = /[\\\/*?"<>| ,#:-]/.matcher(serviceName).replaceAll('_');
          dataset = "apm.app." + serviceName;
        }
        ctx.data_stream = ["type": "metrics", "dataset": dataset, "namespace": "migrated"];
  - set:
      if: ctx.data_stream != null
      field: _index
      value: "{{data_stream.type}}-{{data_stream.dataset}}-{{data_stream.namespace}}"

apm_user_agent:
  description: Add user agent information for APM events
  processors:
  - user_agent:
      field: user_agent.original
      target_field: user_agent
      ignore_missing: true
      ignore_failure: true

apm_user_geo:
  description: Add user geo information for APM events
  processors:
  - geoip:
      database_file: GeoLite2-City.mmdb
      field: client.ip
      target_field: client.geo
      ignore_missing: true
      on_failure:
      - remove:
          field: client.ip
          ignore_missing: true
          ignore_failure: true

apm_ingest_timestamp:
  description: Add an ingest timestamp for APM events
  processors:
  - set:
      if: ctx.processor?.event != 'span'
      field: event.ingested
      value: "{{_ingest.timestamp}}"

apm_remove_span_metadata:
  description: Removes metadata fields available already on the parent transaction, to save storage
  processors:
  - remove:
      if: ctx.processor?.event == 'span'
      field:
      - host
      - process
      - user
      - user_agent
      - container
      - kubernetes
      - service.node
      - service.version
      - service.language
      - service.runtime
      - service.framework
      ignore_missing: true
      ignore_failure: true

apm_error_grouping_name:
  description: Set error.grouping_name for APM error events
  processors:
  - script:
      source: ctx.error.grouping_name = ctx.error.exception[0].message
      if: ctx.error?.exception?.length != null && ctx.error?.exception?.length > 0
  - set:
      field: error.grouping_name
      value: "{{error.log.message}}"
      if: ctx.error?.log?.message != null

# TODO(axw) handle unit in metric descriptions.
# See https://github.com/elastic/elasticsearch/issues/72536
apm_metrics_dynamic_template:
  description: Set dynamic_templates for application metrics
  processors:
  - script:
      if: ctx._metric_descriptions != null
      source: |
        Map dynamic_templates = new HashMap();
        for (entry in ctx._metric_descriptions.entrySet()) {
          String name = entry.getKey();
          Map description = entry.getValue();
          String metric_type = description.type;
          if (metric_type == "histogram") {
            dynamic_templates[name] = "histogram";
          }
        }
        ctx._dynamic_templates = dynamic_templates;
        ctx.remove("_metric_descriptions");

Why are the log event filteredOut but the spans correctly processed and shown in the UI ? How can I find which rule filter out theses events ?

The log line indeed indicates that an event was dropped and another processor was trying to process the nil event afterwards.
I don't see how this would be happening for transaction events with the default apm pipelines. Please query all ingest pipelines from the cluster where you see the failures and review them again.

I queried all ingest pipelines and checked them. Notting seems out of place. No "drop" action, some "remove" but the conditions should not trigger for these events.

Should I look for something in particular ? Should I remove some of them ? I don't want to break something for the others inputs and not be able to fix it after.

Here is the whole result : https:// pastebin .com /i5NESc3v

Thank you for your time !

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.