Indexing Application Trace Data into Separate Indices Based on Application Names

I'm currently working with an Elasticsearch cluster to store and manage trace and metric data from various applications running in a Kubernetes environment. We use OpenTelemetry for instrumentation, collecting metrics and traces, and forwarding them to Elasticsearch via the OpenTelemetry Collector deployed in our cluster. The Collector is configured to send data to our Elasticsearch cluster, which is set up with a Fleet-managed APM server.

Our setup effectively captures and indexes metrics data into separate indices based on application names, thanks to the configuration and capabilities of the OpenTelemetry Collector and Elasticsearch. However, we're facing a challenge with how trace data is indexed. Unlike metrics, all application trace data is being indexed into a single index, despite our need to have them stored in separate indices based on the application names. This is crucial for our monitoring, analysis, and data retention policies, which vary from one application to another.

I've looked through the documentation and tried various configurations but haven't found a clear path to achieve this. Here are the specifics of our current setup:

  • Elasticsearch Version: 8.12
  • OpenTelemetry Collector Image Version: 0.87.0
  • Kubernetes Version: v1.28.5
  • APM Server Version: 8.12

I'm seeking guidance or suggestions on how to configure either the OpenTelemetry Collector or Elasticsearch to dynamically index trace data into application-specific indices. Ideally, this would involve parsing the application name from the trace data and using it to direct the data to the corresponding index.

Has anyone tackled a similar challenge or can offer insights into possible solutions?

Hi @Sayyad_Seyidli ,

If I understand correctly, you configured the OTel Collector to send data to the APM Server in your deployment, and the APM Server is responsible for receiving the OTel data and sending it to Elasticsearch.

There is a similar use case in the GitHub issue "[OpenTelemetry] data_stream.namepace and data_stream.dataset aren't being respected" (Issue #10191, elastic/apm-server). An existing way to achieve what you want is to use the reroute processor, as mentioned in that issue. From 8.13 (not released at the time of writing), data routing can be controlled via OTel attributes thanks to this change: "Respect data_stream.{dataset,namespace} for OTel logs, metrics, and traces" by carsonip (Pull Request #201, elastic/apm-data).
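To make the reroute idea a bit more concrete, here is a rough sketch in Python of what such an ingest pipeline could look like. It is not a tested configuration for your cluster. It assumes the application name lives in the `service.name` field of the trace documents and that the pipeline is registered under the custom pipeline name `traces-apm@custom`; please verify both against your own setup. The helper names, the `generic` fallback dataset, the example URL, and the API key are made up for illustration.

```python
import json
import urllib.request


def build_reroute_pipeline(fallback_dataset="generic"):
    """Build an ingest pipeline body that reroutes each document by service name.

    Assumption: trace documents carry the application name in `service.name`.
    The reroute processor tries the dataset values in order, so the static
    fallback is used when `service.name` is missing.
    """
    return {
        "description": "Sketch: route APM traces to per-service data streams",
        "processors": [
            {
                "reroute": {
                    "tag": "route-traces-by-service",
                    "dataset": ["{{service.name}}", fallback_dataset],
                }
            }
        ],
    }


def build_put_request(es_url, api_key, pipeline_id="traces-apm@custom"):
    """Prepare (but do not send) the PUT request that would register the pipeline.

    `pipeline_id` is an assumption; check which custom pipeline your
    traces data stream actually calls.
    """
    body = json.dumps(build_reroute_pipeline()).encode("utf-8")
    return urllib.request.Request(
        url=f"{es_url.rstrip('/')}/_ingest/pipeline/{pipeline_id}",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_key}",
        },
    )


if __name__ == "__main__":
    # Print the pipeline body so it can be reviewed or pasted into Dev Tools.
    print(json.dumps(build_reroute_pipeline(), indent=2))
    # To actually register it (hypothetical URL and key):
    # urllib.request.urlopen(build_put_request("https://localhost:9200", "<api-key>"))
```

With a pipeline like this, a trace from a service named `checkout` in the `default` namespace would be expected to land in a data stream along the lines of `traces-checkout-default`. Two things to check before relying on it: the new data streams will probably need a matching index template (and lifecycle policy) so they get the right mappings and retention, and the credentials the APM Server writes with need privileges on the rerouted targets, otherwise the documents may be rejected.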
