APM Server 9.1.4 Container Always Fails with 401 Despite Valid Credentials

I’m running Elasticsearch, Kibana, and APM Server 9.1.4 in Docker using Docker Compose. All containers are on the same custom bridge network. Elasticsearch shows green cluster health and the apm_system user is enabled with the correct password.

Example checks from the host and Elasticsearch container confirm the credentials work:

curl -u apm_system:<password> "http://localhost:9200/_cluster/health?pretty"
# Returns cluster health GREEN

However, the APM Server container logs repeatedly show:

precondition failed: error querying cluster_uuid: status_code=401
refresh cache error: context deadline exceeded

  • The container is minimal: it has no curl, ping, or sh, so I cannot test connectivity from inside it.

  • I verified environment variables are set correctly in the container:

APM_ELASTIC_USERNAME=apm_system
APM_ELASTIC_PASSWORD=<password>

  • Kibana is not involved in this setup (no Fleet).

  • Elasticsearch is reachable from host and other containers.

It seems like the APM Server container cannot authenticate with Elasticsearch, even though credentials are correct. Could this be a networking or minimal container issue? Any suggestions for debugging or a reliable way to confirm connectivity from inside the APM container?
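
One way to test from the APM Server's point of view without a shell inside it is to start a throwaway container that joins the same network namespace and run curl there. The following is a minimal sketch, not a verified fix: the function names are hypothetical, the container name, URL, and credentials are placeholders, and it assumes the public curlimages/curl image can be pulled. It uses Docker's `--network container:<name>` mode so that DNS and routing match what the APM Server container sees.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; container name, URL, and credentials are placeholders.

# Map an HTTP status code (as printed by curl -w '%{http_code}') to a diagnosis.
classify_status() {
  case "$1" in
    200) echo "ok: reachable and authenticated" ;;
    401) echo "auth: reachable, credentials rejected or not sent" ;;
    403) echo "authz: authenticated, missing privileges" ;;
    000) echo "network: no HTTP response (DNS, routing, or port)" ;;
    *)   echo "other: HTTP $1" ;;
  esac
}

# Run curl in a throwaway container that shares the network namespace of the
# target container, then classify the status code it reports.
probe_from_container() {
  local container="$1" url="$2" user="$3" pass="$4"
  local code
  code=$(docker run --rm --network "container:${container}" curlimages/curl \
    -s -o /dev/null -w '%{http_code}' -u "${user}:${pass}" "${url}")
  classify_status "${code}"
}
```

For example, `probe_from_container apm-server http://elasticsearch:9200/ apm_system '<password>'`, where `apm-server` stands in for the real container name shown by `docker ps`. A 200 here alongside the 401 in the APM Server logs would suggest the server is not sending these credentials rather than a networking problem; a 000 would point at DNS or routing instead.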

Additional Context and Findings

Updated Configuration Attempts:

  • Created a custom apm_writer user with an apm_writer_role that has full permissions on the APM indices (apm-*, traces-*, metrics-apm-*, logs-apm-*), including the create_index, write, and manage privileges

  • Updated APM Server environment variables to use the new user: APM_ELASTIC_USERNAME=apm_writer and APM_ELASTIC_PASSWORD=apm_writer_password123!

  • Added cluster-level permissions (monitor, cluster:admin/xpack/monitoring/bulk) to the custom role

  • Verified the custom user can successfully create indices: curl -u apm_writer:apm_writer_password123! -X PUT "localhost:9200/apm-test-index" returns {"acknowledged":true}
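
For reference, the role described in the bullets above could be expressed roughly as follows. This is a sketch reconstructed from the description, not the exact request that was sent: the function names are hypothetical, the admin credentials are placeholders, and it assumes the Elasticsearch security role API (`PUT /_security/role/<name>`).

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the role body mirrors the privileges listed above.

# Print the JSON body for apm_writer_role.
apm_writer_role_body() {
  cat <<'JSON'
{
  "cluster": ["monitor", "cluster:admin/xpack/monitoring/bulk"],
  "indices": [
    {
      "names": ["apm-*", "traces-*", "metrics-apm-*", "logs-apm-*"],
      "privileges": ["create_index", "write", "manage"]
    }
  ]
}
JSON
}

# Create or update the role. Arguments: base URL, admin user, admin password
# (all placeholders supplied by the caller).
create_apm_writer_role() {
  local es_url="$1" admin_user="$2" admin_pass="$3"
  curl -s -u "${admin_user}:${admin_pass}" \
    -X PUT "${es_url}/_security/role/apm_writer_role" \
    -H 'Content-Type: application/json' \
    -d "$(apm_writer_role_body)"
}
```

A call would look like `create_apm_writer_role http://localhost:9200 elastic '<password>'`. Since a 401 signals failed authentication rather than missing privileges, widening this role is not expected to change the error on its own.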

Docker Compose Configuration:

apm-server:
  environment:
    - output.elasticsearch.hosts=["http://elasticsearch:9200"]
    - output.elasticsearch.username=${APM_ELASTIC_USERNAME}
    - output.elasticsearch.password=${APM_ELASTIC_PASSWORD}
    - apm-server.host=0.0.0.0:8200
    - apm-server.data_streams.wait_for_integration=false

Current Behavior:

  • APM Server accepts requests from .NET applications (returns 202 status codes)

  • No APM data indices are being created in Elasticsearch

  • APM Server remains in "blocking ingestion until all preconditions are satisfied" state

  • The 401 errors persist even with a user that has full cluster and index permissions

Key Questions:

  1. In Elasticsearch 9.1.4, is the apm_system user still the recommended approach, or should we use a custom user with full permissions?

  2. Are there specific cluster-level permissions required for APM Server beyond monitor and cluster:admin/xpack/monitoring/bulk?

  3. Could this be related to the APM Server's internal authentication mechanism or a known issue with the 9.1.4 version?

Environment Details:

  • Elasticsearch 9.1.4 with security enabled (xpack.security.enabled=true)

  • APM Server 9.1.4

  • Docker Compose with custom bridge network

  • All containers can communicate (verified with other services)