Failed to handle event of type ERROR with this error: null

Need you help on this null pointer exception!

APM Agent used as layers to AWS Lambda:

  • arn:aws:lambda:eu-west-1:267093732750:layer:elastic-apm-extension-ver-1-5-7-arm64:1

  • arn:aws:lambda:eu-west-1:267093732750:layer:elastic-apm-java-ver-1-52-1:1

Logs:

INIT_START Runtime Version: java:21.v27 Runtime Version ARN: arn:aws:lambda:eu-west-1::runtime:2745c98a6107cf976270b76b7e188a389c11f12de05de32ce02983150620f5be
{"log.level":"info","@timestamp":"2025-01-07T16:40:35.158Z","log.origin":{"function":"github.com/elastic/apm-aws-lambda/apmproxy.(*Client).StartReceiver.func1","file.name":"apmproxy/receiver.go","file.line":58},"message":"Extension listening for apm data on :8200","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2025-01-07T16:40:35.158Z","log.origin":{"function":"github.com/elastic/apm-aws-lambda/logsapi.(*Client).startHTTPServer.func1","file.name":"logsapi/subscribe.go","file.line":75},"message":"Extension listening for Lambda Logs API events on 169.254.79.130:35045","ecs.version":"1.6.0"}
LOGS Name: apm-lambda-extension State: Subscribed Types: [Platform, Function]
{"log.level":"info","@timestamp":"2025-01-07T16:40:35.164Z","log.origin":{"function":"github.com/elastic/apm-aws-lambda/app.(*App).processEvent","file.name":"app/run.go","file.line":157},"message":"Waiting for next event...","ecs.version":"1.6.0"}
Picked up JAVA_TOOL_OPTIONS: -javaagent:/opt/elastic-apm-agent.jar -XX:+TieredCompilation -XX:TieredStopAtLevel=1
2025-01-07 16:40:36,443 [main] INFO co.elastic.apm.agent.util.JmxUtils - Found JVM-specific OperatingSystemMXBean interface: com.sun.management.OperatingSystemMXBean
2025-01-07 16:40:36,485 [main] INFO co.elastic.apm.agent.util.JmxUtils - Found JVM-specific ThreadMXBean interface: com.sun.management.ThreadMXBean
2025-01-07 16:40:36,502 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - Starting Elastic APM 1.52.1 as SimulatePartialEarlyRepayment (4) on Java 21.0.5 Runtime version: 21.0.5+11-LTS VM version: 21.0.5+11-LTS (Amazon.com Inc.) Linux 5.10.227-239.884.amzn2.aarch64
2025-01-07 16:40:36,503 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - service_name: 'SimulatePartialEarlyRepayment' (source: Environment Variables)
2025-01-07 16:40:36,503 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - central_config: 'false' (source: Environment Variables)
2025-01-07 16:40:36,503 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - cloud_provider: 'NONE' (source: Environment Variables)
2025-01-07 16:40:36,503 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - activation_method: 'AWS_LAMBDA_LAYER' (source: Environment Variables)
2025-01-07 16:40:36,504 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - secret_token: 'XXXX' (source: Environment Variables)
2025-01-07 16:40:36,504 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - metrics_interval: '0s' (source: Environment Variables)
2025-01-07 16:40:36,504 [main] INFO co.elastic.apm.agent.configuration.StartupInfo - aws_lambda_handler: 'com.xxxxx.aws.jobs.functions.SimulatePartialEarlyRepaymentHandler::handleRequest' (source: Environment Variables)
2025-01-07 16:40:36,505 [main] WARN co.elastic.apm.agent.configuration.StartupInfo - To enable all features and decrease startup time, please configure application_packages
2025-01-07 16:40:38,411 [main] INFO co.elastic.apm.agent.impl.ElasticApmTracer - Tracer switched to RUNNING state
2025-01-07 16:40:38,666 [elastic-apm-server-healthcheck] INFO co.elastic.apm.agent.report.ApmServerHealthChecker - Elastic APM server is available: { "build_date": "2023-02-13T13:01:54Z", "build_sha": "8638b035d700e5e85e376252402b5375e4d4190b", "publish_ready": true, "version": "8.6.2"}
{"ecs":{"version":"1.6.0"},"log":{"level":"INFO"},"message":"Constructor initialized","event":{"created":"2025-01-07T16:40:40.489Z","action":"","category":"aws-lambda","outcome":"success"},"transaction":{"id":"e4ec7819-95a3-46b2-8b62-164b1c8f9c4a"},"AppInfo":{"service":{"name":"","id":""},"name":"","description":"","cmdbId":""}}
EXTENSION Name: apm-lambda-extension State: Ready Events: [INVOKE, SHUTDOWN]
2025-01-07 18:31:31,461 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Failed to handle event of type ERROR with this error: null
2025-01-07 18:31:31,463 [elastic-apm-server-reporter] INFO co.elastic.apm.agent.report.AbstractIntakeApiHandler - Backing off for 9 seconds (+/-10%)
2025-01-07 18:31:31,469 [elastic-apm-init-instrumentation-shutdown-hook] INFO co.elastic.apm.agent.bci.InstrumentationStatsLifecycleListener - Used instrumentation groups: [apache-httpclient, aws-sdk, executor, executor-collection, fork-join, log-correlation, log-error, log-reformatting, logging, process, ssl-context, urlconnection]
2025-01-07 18:31:31,470 [elastic-apm-circuit-breaker] INFO co.elastic.apm.agent.impl.circuitbreaker.CircuitBreaker - Stopping the Circuit Breaker thread.
2025-01-07 18:31:31,469 [elastic-apm-configuration-reloader] ERROR co.elastic.apm.agent.util.ExecutorUtils - null
java.util.concurrent.CancellationException: null
at java.util.concurrent.FutureTask.report(Unknown Source) ~[?:?]
at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:?]
at co.elastic.apm.agent.util.ExecutorUtils.logException(ExecutorUtils.java:217) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.util.ExecutorUtils.access$100(ExecutorUtils.java:41) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.util.ExecutorUtils$1.afterExecute(ExecutorUtils.java:64) ~[elastic-apm-agent.jar:1.52.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
at co.elastic.apm.agent.util.ExecutorUtils$2.run(ExecutorUtils.java:99) ~[elastic-apm-agent.jar:1.52.1]
at java.lang.Thread.run(Unknown Source) [?:?]

Hi !

Thanks for reporting this, I have a few questions to help identify where the issue is.

This message indicates the issue seem when the data is sent to the extension:

  • the JVM seem to have run for close to 2 hours without any other error message, apart from this log message did the agent worked as expected and you got the expected data visible in Kibana ?
  • how many transactions were captured by this lambda ?
  • is the issue easily reproducible or is it random ?

Thank you @Sylvain_Juge.

Unfortunately, 0 transactions/logs were sent to the APM server.
No data on Kibana from AWS Lambdas which are organized under an AWS step function.

on every run is the same situation.

Added some additional logs collected in debug mode:

2025-01-08 10:37:00,377 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving ERROR event (sequence 0)

2025-01-08 10:37:05,379 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Failed to handle event of type ERROR with this error: null
2025-01-08 10:37:05,379 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Event handling failure
java.util.concurrent.TimeoutException: null
at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:?]
at co.elastic.apm.agent.impl.metadata.MetaDataFuture.get(MetaDataFuture.java:60) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.impl.metadata.MetaDataFuture.get(MetaDataFuture.java:28) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.serialize.DslJsonSerializer.waitForMetadata(DslJsonSerializer.java:144) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.serialize.DslJsonSerializer.access$300(DslJsonSerializer.java:109) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.serialize.DslJsonSerializer$Writer.blockUntilReady(DslJsonSerializer.java:788) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.AbstractIntakeApiHandler.startRequest(AbstractIntakeApiHandler.java:94) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.IntakeV2ReportingEventHandler.startRequest(IntakeV2ReportingEventHandler.java:210) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.IntakeV2ReportingEventHandler.handleIntakeEvent(IntakeV2ReportingEventHandler.java:144) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.IntakeV2ReportingEventHandler.dispatchEvent(IntakeV2ReportingEventHandler.java:127) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.IntakeV2ReportingEventHandler.onEvent(IntakeV2ReportingEventHandler.java:92) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.IntakeV2ReportingEventHandler.onEvent(IntakeV2ReportingEventHandler.java:39) ~[elastic-apm-agent.jar:1.52.1]
at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:168) ~[elastic-apm-agent.jar:1.52.1]
at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.util.ExecutorUtils$2.run(ExecutorUtils.java:99) ~[elastic-apm-agent.jar:1.52.1]
at java.lang.Thread.run(Unknown Source) [?:?]

Here the waitForMetadata method indicates that the agent is trying to fetch the service metadata, and here it is reaching a timeout (I think it's about 5s).

Getting service metadata should not take that long usually, can you provide the agent configuration (java properties or env variables) for more details ? Make sure to redact any sensitive information like token/service name for example.

These are the agent configurations as env variable given to each AWS Lambda from the StepFunction.

AWS_LAMBDA_EXEC_WRAPPER: /opt/elastic-apm-handler
ELASTIC_APM_SERVICE_NAME: XXXXPartialYYYYYY
ELASTIC_APM_LAMBDA_APM_SERVER: ....
ELASTIC_APM_SECRET_TOKEN: ....
ELASTIC_APM_SEND_STRATEGY: background
ELASTIC_APM_LOG_LEVEL: DEBUG
ELASTIC_APM_APPLICATION_PACKAGES: com.xxxx

I am not an expert with AWS lambda deployments, but you should probably set the ELASTIC_APM_AWS_LAMBDA_HANDLER environment variable.

From the documentation it is required for AWS Lambda, and while it does allow to properly give a name to the lambda, it seems the agent behavior is also not the same when it's not set. Fetching metadata is something that usually happens in "regular" cloud deployments like EC2 or GCP, and with lambda the metadata should be provided through a different mean rather than the agent trying to fetch it with an HTTP request.

Added suggested env variable ELASTIC_APM_AWS_LAMBDA_HANDLER, but the situation didn't change!

Set logger on TRACE, and noticed the same null exception coming from ExecutorUtils multiple times:

2025-01-13 15:51:41,614 [main] TRACE co.elastic.apm.agent.util.ExecutorUtils - Stack trace related to thread creation:
java.lang.Throwable: null
at co.elastic.apm.agent.util.ExecutorUtils.logThreadCreation(ExecutorUtils.java:141) [elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.util.ExecutorUtils$SingleNamedThreadFactory.newThread(ExecutorUtils.java:130) [elastic-apm-agent.jar:1.52.1]
at java.util.concurrent.ThreadPoolExecutor$Worker.<init>(Unknown Source) [?:?]
at java.util.concurrent.ThreadPoolExecutor.addWorker(Unknown Source) [?:?]
at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source) [?:?]
at co.elastic.apm.agent.impl.circuitbreaker.CircuitBreaker.start(CircuitBreaker.java:59) [elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.impl.ElasticApmTracer.startSync(ElasticApmTracer.java:787) [elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.impl.ElasticApmTracer.start(ElasticApmTracer.java:749) [elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.bci.ElasticApmAgent.initialize(ElasticApmAgent.java:166) [elastic-apm-agent.jar:1.52.1]
at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(Unknown Source) ~[?:?]
at java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]
at co.elastic.apm.agent.premain.AgentMain.loadAndInitializeAgent(AgentMain.java:157) [elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.premain.AgentMain.init(AgentMain.java:98) [elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.premain.AgentMain.premain(AgentMain.java:47) [elastic-apm-agent.jar:1.52.1]
at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(Unknown Source) ~[?:?]
at java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]
at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(Unknown Source) [?:?]
at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(Unknown Source) [?:?]

This exception here is, as the exception message explains, only to record the stacktrace where a new thread is being created, this does not relate at all to the other exception that you had previously.

Do you still get the following in the agent logs ?

2025-01-08 10:37:05,379 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Failed to handle event of type ERROR with this error: null
2025-01-08 10:37:05,379 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Event handling failure
java.util.concurrent.TimeoutException: null
at java.util.concurrent.FutureTask.get(Unknown Source) ~[?:?]
at co.elastic.apm.agent.impl.metadata.MetaDataFuture.get(MetaDataFuture.java:60) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.impl.metadata.MetaDataFuture.get(MetaDataFuture.java:28) ~[elastic-apm-agent.jar:1.52.1]
at co.elastic.apm.agent.report.serialize.DslJsonSerializer.waitForMetadata(DslJsonSerializer.java:144) ~[elastic-apm-agent.jar:1.52.1]

Yes, continue to have it!