Elasticsearch APM Fleet integration - index per application

We have an Elasticsearch cluster with 4 data nodes and Fleet Servers with the APM integration enabled. We have noticed that Elasticsearch creates at least one index per app/service, for example:

  • .ds-logs-apm.app.myapp1-2024.12.19-000019
  • .ds-logs-apm.app.myapp2-2024.12.19-000019
  • .ds-logs-apm.app.myapp3-2024.12.19-000019
  • .ds-metrics-apm.app.myapp1-2024.12.19-000019
  • .ds-metrics-apm.app.myapp2-2024.12.19-000019
  • .ds-metrics-apm.app.myapp3-2024.12.19-000019

The result is that we hit our shard limit (4000/4000). We were wondering if we can set up the APM integration to save all the metrics and logs in two big indices (one for metrics and one for logs) instead of splitting them per app.
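(For reference, the running shard total can be checked against the limit with the cluster health API, and the limit itself comes from the cluster.max_shards_per_node setting, which defaults to 1000 per data node, hence 4000 on our 4-node cluster:)

GET _cluster/health?filter_path=active_shards

GET _cluster/settings?include_defaults=true&filter_path=**.max_shards_per_node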

If yes, how do we do it, and what are the cons?

thank you

Hi @fabry_00

Before we try to help, there are a couple of questions.

What version are you on?

Are you using the Fleet-managed APM Server or the standalone APM Server binary?

See here

And when I look at the data stream backing index...

.ds-logs-apm.app.myapp1-2024.12.19-000019

it seems to be missing the namespace I would expect:

.ds-logs-apm.app.myapp1-default-2024.12.19-000019

So perhaps provide a bit more detail and we can help... and yes, you can "collapse" those data streams... with a few caveats.

If you are on 8.x and using Fleet-managed APM, you can add a pipeline like this, which will collapse the per-app data streams into a single data stream:

logs-apm.app.default-default

CAUTION Test First

PUT _ingest/pipeline/logs-apm.app@custom
{
  "processors": [
    {
      "set": {
        "if": "ctx.service?.environment == null",
        "field": "service.environment",
        "value": "unknown"
      }
    },
    {
      "set": {
        "field": "custom_pipeline",
        "value": "logs-apm.app@custom"
      }
    },
    {
      "reroute": {
        "dataset": [
          "apm.app.default"
        ],
        "namespace": [
          "default"
        ]
      }
    }
  ]
}
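One low-risk way to check the reroute before relying on it is the simulate API. A sketch (the sample document below, including the service name and message fields, is purely illustrative):

POST _ingest/pipeline/logs-apm.app@custom/_simulate
{
  "docs": [
    {
      "_index": "logs-apm.app.myapp1-default",
      "_source": {
        "service": { "name": "myapp1" },
        "message": "simulate reroute test"
      }
    }
  ]
}

The _index in the simulated result should show the rerouted target (logs-apm.app.default-default) rather than the original per-app data stream.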

Hi @stephenb ,

We are using ELK 8.16.2
All our cluster and fleet servers are deployed on premise and are using the Fleet-managed APM Server.

When you say "it seems like it is missing the namespace I would expect": we have customized the "namespace" setting in the APM integration to "prod", following the example in Data streams | Fleet and Elastic Agent Guide [8.16] | Elastic.
So yes, the real index names are: .ds-logs-apm.app.myapp1-prod-2024.12.19-000019

Being precise helps.... :wink:

Excellent... so you can try the pipeline I suggested above; just change

        "namespace": [
          "prod"
        ]

and that will put all your APM logs into a single data stream:
logs-apm.app.default-prod
All the Kibana apps / functionality / etc. should still work.

You can change the dataset to your own name; just make sure it starts with apm.app. For example, others have used apm.app.all.
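For example, combining the apm.app.all dataset name mentioned above with your "prod" namespace, the reroute processor in the pipeline would become:

    {
      "reroute": {
        "dataset": [
          "apm.app.all"
        ],
        "namespace": [
          "prod"
        ]
      }
    }

which would send everything to the single data stream logs-apm.app.all-prod.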

Make sure you test in Non-Prod

The same approach should work for metrics... there can sometimes be field collisions, so definitely test in Non-Prod.
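A sketch of the metrics equivalent, assuming the metrics @custom pipeline hook follows the same naming convention (metrics-apm.app@custom); as above, test in Non-Prod first, since metric documents from different services can carry conflicting field mappings:

PUT _ingest/pipeline/metrics-apm.app@custom
{
  "processors": [
    {
      "reroute": {
        "dataset": [
          "apm.app.default"
        ],
        "namespace": [
          "prod"
        ]
      }
    }
  ]
}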