I'm working with an Elasticsearch cluster that stores trace and metric data from applications running in a Kubernetes environment. We instrument the applications with OpenTelemetry and forward the collected metrics and traces to Elasticsearch through an OpenTelemetry Collector deployed in the same Kubernetes cluster. The Collector sends data to our Elasticsearch cluster, which is set up with a Fleet-managed APM Server.
Metrics behave as we want: they are indexed into separate indices based on application name. Traces are the problem. Unlike metrics, trace data from all applications is written to a single index, although we need it stored in separate indices per application. This matters because our monitoring, analysis, and data retention policies differ from one application to another.
I've looked through the documentation and tried various configurations but haven't found a clear path to achieve this. Here are the specifics of our current setup:
- Elasticsearch Version: 8.12
- OpenTelemetry Collector Image Version: 0.87.0
- Kubernetes Version: v1.28.5
- APM Server Version: 8.12
I'm seeking guidance or suggestions on how to configure either the OpenTelemetry Collector or Elasticsearch to dynamically index trace data into application-specific indices. Ideally, this would involve parsing the application name from the trace data and using it to direct the data to the corresponding index.
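To make the desired behavior concrete, the routing could be sketched roughly as follows in Python. This only illustrates the intent; it is not Collector or Elasticsearch configuration. The `traces-apm.<service>-<namespace>` naming scheme, the shape of the span dictionary, and the `sanitize` and `target_index` helpers are all assumptions made for the example. The only part taken from OpenTelemetry itself is that the application name is carried in the `service.name` resource attribute.

```python
import re

# Fallback used when a span carries no usable service name (assumption).
DEFAULT_SERVICE = "unknown"


def sanitize(name: str) -> str:
    """Lowercase the name and replace anything outside [a-z0-9_.] with '_'.

    Index names must be lowercase and may not contain spaces or several
    punctuation characters, so this keeps a conservative subset.
    """
    cleaned = re.sub(r"[^a-z0-9_.]+", "_", name.lower()).strip("_.")
    return cleaned or DEFAULT_SERVICE


def target_index(span: dict, prefix: str = "traces-apm",
                 namespace: str = "default") -> str:
    """Return the per-application index a span should be routed to.

    `span` is a hypothetical, simplified document with the OpenTelemetry
    resource attributes under the "resource" key.
    """
    resource = span.get("resource", {})
    service = resource.get("service.name") or DEFAULT_SERVICE
    return f"{prefix}.{sanitize(service)}-{namespace}"


if __name__ == "__main__":
    example = {"resource": {"service.name": "Checkout-API"}, "name": "GET /cart"}
    print(target_index(example))
```

In other words, every span whose `service.name` is `Checkout-API` would end up in one index, spans from another service in a different one, and spans without a service name in a shared fallback index.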
Has anyone tackled a similar challenge, or can anyone offer insight into possible solutions?