Hey Elastic APM Team,
I would like to report a bug for: GitHub - elastic/apm-agent-java: Elastic APM Java Agent (apm-agent-plugins/apm-grpc/apm-grpc-plugin)
APM Agent language and version: Java, 1.39.0
tl;dr
A warning will be shown for every grpc-request, if those 2 dependencies are both used in a project:
- GitHub - elastic/apm-agent-java: Elastic APM Java Agent (apm-agent-plugins/apm-grpc/apm-grpc-plugin, newest version 1.39.0)
- GitHub - yidongnan/grpc-spring-boot-starter: Spring Boot starter module for gRPC framework. (newest version 2.14.0.RELEASE) -- The most commonly used spring-boot / grpc integration
I do not think it is really the fault of the second dependency, I debugged into the problem a bit and I think this dependency uses a feature of the official grpc library, that is not commonly used otherwise.
Bug description
The first GRPC request of a freshly started micro-service throws the following exception:
WARN co.elastic.apm.agent.impl.transaction.AbstractSpan - End has already been called: 'de.testservice.TestService/> testRequest' 00-00000000000000000000000000000000-0000000000000000-00 (000000)
java.lang.Throwable: null
at co.elastic.apm.agent.impl.transaction.AbstractSpan.end(AbstractSpan.java:584) ~[elastic-apm-agent-00000000000000000000000000000000-00000000000000000000000000000000.jar:1.39.0]
at co.elastic.apm.agent.impl.transaction.AbstractSpan.end(AbstractSpan.java:559) ~[elastic-apm-agent-00000000000000000000000000000000-00000000000000000000000000000000.jar:1.39.0]
at co.elastic.apm.agent.grpc.GrpcHelper.exitServerListenerMethod(GrpcHelper.java:272) ~[elastic-apm-agent-00000000000000000000000000000000-00000000000000000000000000000000.jar:1.39.0]
at co.elastic.apm.agent.grpc.ServerCallListenerInstrumentation$OnComplete$OnCompleteAdvice.onExit(ServerCallListenerInstrumentation.java:166) > ~[elastic-apm-agent-00000000000000000000000000000000-00000000000000000000000000000000.jar:1.39.0]
at io.grpc.Contexts$ContextualizedServerCallListener.onComplete(Contexts.java:110) ~[grpc-api-1.51.0.jar!/:1.51.0]
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.closedInternal(ServerCallImpl.java:375) ~[grpc-core-1.51.0.jar!/:1.51.0]
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.closed(ServerCallImpl.java:365) ~[grpc-core-1.51.0.jar!/:1.51.0]
at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1Closed.runInContext(ServerImpl.java:923) ~[grpc-core-1.51.0.jar!/:1.51.0]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[grpc-core-1.51.0.jar!/:1.51.0]
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) ~[grpc-core-1.51.0.jar!/:1.51.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
at java.lang.Thread.run(Unknown Source) [?:?]
Every other request just logs a warning:
WARN co.elastic.apm.agent.impl.transaction.AbstractSpan - End has already been called: 'de.testservice.TestService/> testRequest' 00-00000000000000000000000000000000-0000000000000000-00 (000000)
Minimal reproducible example
I tried to create a service of minimal size, that can be used to reproduce the bug. To start the code, set the environment variable ELASTIC_APM_SECRET_TOKEN and then:
mvn clean install spring-boot:run
The sourcecode can be found here: 5.189.140.198/test-service.zip
My thoughts on what goes wrong
I tried to debug a bit into it myself, because I thought at first the bug might be in the other dependency. I think the problem is as follows:
grpc-spring-boot-starter uses the officially supported method:
io.grpc.Contexts.interceptCall
to integrate grpc into spring-boot. This constructs a ContextualizedServerCallListener, which overrides the method onComplete, but also calls super.onComplete in this overwritten method.
But now
co.elastic.apm.agent.grpc.OnComplete.OnCompleteAdvice
gets called twice with the same listener, once for the onComplete and once for the super.onComplete (instead of just once as usually, if another listener is used)
In the debugger we now see the following invocations:
GrpcHelper.getInstance().enterServerListenerMethod(A);
GrpcHelper.getInstance().enterServerListenerMethod(A);
GrpcHelper.getInstance().exitServerListenerMethod(thrown, A, transaction, Status.CANCELLED);
GrpcHelper.getInstance().exitServerListenerMethod(thrown, A, transaction, Status.CANCELLED);
The last exitServerListenerMethod invocation now writes the warning, because A was already exited before.
I am not 100% sure that this is the problem, but it looks like if I debug into it with and without the grpc-spring-boot-starter dependency in use.
Conclusion
For now, I would be happy if you could confirm or deny that bug, because we see a lot of those warnings in all our micro-services. If you need anything else from me, I am happy to help.
Kind regards,
Hilmar