IllegalStateException on Span.makeCurrent

Kibana version: 7.8.0

Elasticsearch version: 7.8.0

APM Server version: 7.8.0

APM Agent language and version: Java 1.33.0

Browser version: Chrome 103

Is there anything special in your setup? I am trying to write an instrumentation plugin.

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant):
I am trying to write two instrumentation plugins with the new plugin API (GitHub - elastic/apm-agent-java-plugin-example: Example of instrumentation which plugs in to the Elastic Java Agent and instruments an application) to better monitor our applications.

The first one is a Tibco message bus instrumentation which also supports distributed tracing (using the OpenTelemetry propagation of the needed headers, as described in the documentation). This seems to work OK as far as I can see.
The second one is an instrumentation of the Eclipse Milo library (used to connect to an OPC-UA server, which is a parameter server from which one can read the values of specific sensors). Standalone, this also seems to work great.

But when a request comes in via the message bus (over Tibco RV) and the handler then tries to read a value from the OPC-UA server, the error/stack trace below is thrown.
There seems to be an issue with creating a sub-span within a span in a propagated context. But when I analyze the code of the OpenTelemetry plugin in the agent, the place the stack trace points to seems to already handle this exact case, so it shouldn't even get to this point in the agent code...

Instrumentation Code:
Code which creates the span in my Eclipse Milo Instrumentation:

    Tracer tracer = GlobalOpenTelemetry.get().getTracer("TibcoServer");
    Span span = tracer.spanBuilder("OPC.readValue")
            .setAttribute("Parameters", ids)
            .setSpanKind(SpanKind.CLIENT)
            .startSpan();
    // Return the Scope object so that it can be closed in the OnMethodExit method
    return span.makeCurrent();

Code which creates the propagated context and the corresponding span/transaction in my Tibco instrumentation:

    Context ctx = GlobalOpenTelemetry.getPropagators()
            .getTextMapPropagator()
            .extract(Context.current(), message, getter);
    Scope extScope = ctx.makeCurrent();

    String subject = message.getSendSubject();
    // This is the recommended way to obtain the tracer with the Elastic OpenTelemetry bridge
    Tracer tracer = GlobalOpenTelemetry.get().getTracer("TibcoServer");
    Span span = tracer.spanBuilder(subject)
            .setSpanKind(SpanKind.SERVER)
            .setAttribute("type", "incoming")
            .startSpan();

Provide logs and/or server output (if relevant):

2022-08-01 08:48:44,963 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.ElasticApmTracer - Activating OTelBridgeContext[{opentelemetry-trace-span-key=PropagatedSpan{ImmutableSpanContext{traceId=f89ec3078b4998153236c89e758eeb4e, spanId=c6062e33fd5df80b, traceFlags=01, traceState=ArrayBasedTraceState{entries=[es, s:1]}, remote=true, valid=true}}}] on thread 32
2022-08-01 08:48:44,974 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to '' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e) (1)
2022-08-01 08:48:44,976 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.ElasticApmTracer - startTransaction '' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e)
2022-08-01 08:48:44,979 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to '***.Development.Gateway.PCS7.Command.getValue' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e) (2)
2022-08-01 08:48:44,981 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.ElasticApmTracer - Activating OTelBridgeContext[{opentelemetry-trace-span-key=OtelSpan['***.Development.Gateway.PCS7.Command.getValue' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e)]}] on thread 32
2022-08-01 08:48:44,982 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to 'XFAB.DRS.8Z1.Development.Gateway.PCS7.Command.getValue' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e) (3)
2022/08/01 08:48:44.991  - PID: 7420        - LEVEL: 4 COM       - bus                   - CSysServiceRunnerCallback  - Run service in bus thread (synchronous).
2022/08/01 08:48:45.018  - PID: 7420        - LEVEL: 4 COM       - bus                   - CSysRvBusReceiveMessageXF  - Received message on subject "***.Development.Gateway.PCS7.Command.getValue" (_INBOX.0A31AFE4.60758AE6233E26.1) from "null": <MESSAGE><CONTENTS type="L" size="1"><PARAMETER type="A">Abluft/Geb35_PFL/2PFL2/L_PFL_202110_PI04/MEAS.V</PARAMETER></CONTENTS></MESSAGE>
2022/08/01 08:48:45.029  - PID: 7420        - LEVEL: 3 ALWAYS    - service               - CSysPcs7GetValue           - Reading Value for Abluft/Geb35_PFL/2PFL2/L_PFL_202110_PI04/MEAS.V
2022-08-01 08:48:45,145 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.Span - startSpan '' 00-36fe412f61d4515688b24da03686d8e7-42ff443191ef965a-01 (20c385b9)
2022-08-01 08:48:45,145 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to '***.Development.Gateway.PCS7.Command.getValue' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e) (4)
2022-08-01 08:48:45,145 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to '' 00-36fe412f61d4515688b24da03686d8e7-42ff443191ef965a-01 (20c385b9) (1)
2022-08-01 08:48:45,146 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to '***.Development.Gateway.PCS7.Command.getValue' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e) (5)
2022-08-01 08:48:45,146 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to '***.Development.Gateway.PCS7.Command.getValue' 00-36fe412f61d4515688b24da03686d8e7-0cb7b1098ef73d37-01 (17639f9e) (6)
2022-08-01 08:48:45,146 [Tibrv Dispatcher] DEBUG co.elastic.apm.agent.impl.transaction.AbstractSpan - increment references to 'OPC.readValue' 00-36fe412f61d4515688b24da03686d8e7-42ff443191ef965a-01 (20c385b9) (2)
java.lang.IllegalStateException: unexpected context type to upgrade: co.elastic.apm.agent.opentelemetry.sdk.OTelBridgeContext
	at co.elastic.apm.agent.opentelemetry.context.OTelContextStorage.current(OTelContextStorage.java:81)
	at io.opentelemetry.context.Context.current(Context.java:91)
	at io.opentelemetry.context.ImplicitContextKeyed.makeCurrent(ImplicitContextKeyed.java:33)
    at ***.eclipse_milo_apm_plugin.RequestValueInstrumentation$AdviceClass.onEnterHandle(RequestValueInstrumentation.java:62)
...

Hi @Shaoranlaos ,

If I understand correctly your use case:

  • the main entry-point of your application is based on Tibco, which is based on a message bus.
  • you then make calls to another system based on Milo (which I'm not familiar with).

Because neither Tibco nor Milo is currently supported by auto-instrumentation, you have created custom agent plugins for them.

If that's correct, then first we need to focus on Tibco instrumentation:

  • the SpanKind should probably be changed to CONSUMER, as the root span (or the root transaction in Elastic APM) is created when an incoming message is received.
  • the created span scope should be properly closed once the message processing is complete (using a try/finally or try-with-resources ideally).

With only the Tibco instrumentation, are you able to properly capture the spans/transactions as you'd expect when you send relevant messages to Tibco?

Then, once this first part is working as expected, we can focus on the Milo part:

  • how are the calls made between Tibco and Milo? Is there a "server" part in Milo where the upstream context should be extracted from?
  • if the calls are made directly from Tibco message processing to Milo, then you should have SpanKind = CLIENT.

Hi @Sylvain_Juge

you understood the use case correctly :slight_smile: .

Some information on the technologies used (for context):

Tibco: a message bus that supports not only event-type messages but also so-called requests, which are messages that the client sends synchronously (so it waits for the other side to reply). This is mainly done to request data which is needed immediately.

Eclipse Milo: a client library that supports the generic API definition named OPC-UA, which is mainly used in areas like industrial monitoring, that is, sensor-based monitoring of industrial equipment and environment controls (e.g. climate control or flow regulation of water and other fluids/gases).

To answer your questions:

  • I am not sure which SpanKind is more correct, CONSUMER or SERVER, because both events and the above-mentioned requests are handled by the same listener interface, which makes the distinction a bit hard. I will try to differentiate in the listener instrumentation (there is a naming convention for that, but I would rather not depend on it).
  • The scope is closed in the @Advice.OnMethodExit. I haven't copied that code above because it is exactly the same as in the example.
  • The Tibco instrumentation is working standalone (even if the distributed tracing / context propagation is not; I'm not sure why, but that is probably a separate issue). I can see the request transaction in the Kibana APM.
  • The API server for the Milo part is proprietary and I can only make API calls against it. So it is an endpoint, and there is no context propagation to this server. This API is called from within an incoming request/message over the Tibco bus.
  • I tried to call the OPC-UA server via Milo standalone (not from a Tibco message context), and that works fine and also shows up in the Kibana APM as expected.
  • The exception only occurs if I try to create a sub-span in the context of the incoming Tibco message request.
  • The SpanKind is already CLIENT for the Milo call, as seen above (first code example).

I hope this is understandable.

Thanks for this detailed answer, so we can summarize your application architecture with:

+---------------+               +---------------+                       +-------------+
| tibco_client  |               | tibco_handler |                       | milo_server |
+---------------+               +---------------+                       +-------------+
        |                               |                                      |
        | sends request message         |                                      |
        |------------------------------>|                                      |
        |                               |                                      |
        |                               | calls using Milo Client API          |
        |                               |------------------------------------->|
        |                               |                                      |
        |                               |        response from Milo Client API |
        |                               |<-------------------------------------|
        |                               |                                      |
        |         send response message |                                      |
        |<------------------------------|                                      |
        |                               |                                      |

In your case, the fact that Tibco is used in a "request/reply" pattern is just an implementation detail of how it is being used: the client that calls it sends a message and waits for a response (sometimes synchronously), but it could also just send a message and not wait for a response.

Usually for message buses, on the message handling part, it is not possible to distinguish for each use-case, thus we have to assume that we always "consume" a message (and thus use the SpanKind.CONSUMER).

From what I understand the execution of the Tibco handler is single-threaded, which means that you just have to instrument a single method when receiving a message to wrap it with a span.
When the Tibco handler span is "active" in the current thread, then it means that other spans that will be created are children of that span.

If the calls to the Milo API are blocking (synchronous), then you should be able to wrap them with a CLIENT span. Because the Tibco handler span should be active at that point, those spans should be created as its children (this implicitly relies on a thread-local that holds the parent span).
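
To illustrate the thread-local mechanism mentioned above, here is a minimal sketch in plain Java. It is not the agent's or OpenTelemetry's implementation: `SimpleSpan`, `Scope`, `start` and `activate` are hypothetical names chosen for this example. It only shows why a span started while another one is active on the same thread picks that one up as its parent, and why the scope has to be closed again (here with try-with-resources).

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ActiveSpanSketch {

    /** Hypothetical span that only remembers its name and its parent. */
    static final class SimpleSpan {
        final String name;
        final SimpleSpan parent;

        SimpleSpan(String name, SimpleSpan parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Hypothetical scope handle; closing it deactivates the span again. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    /** Per-thread stack of active spans; the top entry is the current span. */
    private static final ThreadLocal<Deque<SimpleSpan>> ACTIVE =
            ThreadLocal.withInitial(ArrayDeque::new);

    /** Returns the span that is active on the calling thread, or null if there is none. */
    static SimpleSpan current() {
        return ACTIVE.get().peek();
    }

    /** Starts a span whose parent is whatever span is active on this thread. */
    static SimpleSpan start(String name) {
        return new SimpleSpan(name, current());
    }

    /** Makes the span the current one on this thread until the returned scope is closed. */
    static Scope activate(SimpleSpan span) {
        Deque<SimpleSpan> stack = ACTIVE.get();
        stack.push(span);
        return () -> stack.pop();
    }

    /** Returns the parent name seen by a span started inside an active handler span. */
    static String parentOfNestedSpan() {
        SimpleSpan handler = start("tibco.onMsg");
        try (Scope ignored = activate(handler)) {
            SimpleSpan client = start("OPC.readValue");
            return client.parent.name;
        }
    }

    public static void main(String[] args) {
        System.out.println(parentOfNestedSpan());
        System.out.println(current());
    }
}
```

If the scope were never closed, the handler span would stay on the thread's stack and later, unrelated spans on that thread would be attached to it.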

Also, you will have to instrument message sending with tibco API, thus you can have:

  1. one span created when receiving a message
  2. one span when calling the Milo API
  3. one span when sending the reply message through the Tibco API

For now, there is no need to create a "propagated context", as the spans are only created in the Tibco handler. Context propagation will only be required if you also instrument the Tibco client, in which case you will have:

  1. one span when the tibco client sends the message (instrumentation will have to propagate the context through the tibco headers)
  2. the 3 spans listed above, the message handling instrumentation will have to use the remote context provided through headers to make it as parent.
  3. one span for waiting for the response message
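
As a rough sketch of what propagating the context through message headers means, the following plain-Java example writes and reads a W3C `traceparent` value, which has the same `00-<trace id>-<span id>-<flags>` shape as the identifiers in the agent logs above. A `Map` stands in for the Tibco message fields, and `TraceParentSketch`, `inject` and `extract` are names made up for this example. With the OpenTelemetry API this work is done by the configured TextMapPropagator, so it should not need to be hand-written in a plugin.

```java
import java.util.HashMap;
import java.util.Map;

final class TraceParentSketch {

    static final String HEADER = "traceparent";

    /** Formats a version-00 traceparent value: version-traceId-parentSpanId-flags. */
    static String format(String traceId, String spanId, boolean sampled) {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    /** Sender side: writes the identity of the sending span into the message headers. */
    static void inject(Map<String, String> headers, String traceId, String spanId) {
        headers.put(HEADER, format(traceId, spanId, true));
    }

    /** Receiver side: returns {traceId, parentSpanId}, or null if the header is absent or malformed. */
    static String[] extract(Map<String, String> headers) {
        String value = headers.get(HEADER);
        if (value == null) {
            return null;
        }
        String[] parts = value.split("-");
        // Only the version-00 layout is handled in this sketch.
        if (parts.length != 4 || parts[1].length() != 32 || parts[2].length() != 16) {
            return null;
        }
        return new String[] {parts[1], parts[2]};
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        inject(headers, "36fe412f61d4515688b24da03686d8e7", "0cb7b1098ef73d37");
        System.out.println(headers.get(HEADER));
        System.out.println(String.join(" / ", extract(headers)));
    }
}
```

The receiving side would start its span with the extracted trace id and use the extracted span id as the remote parent.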

OK, I removed all the context propagation code from my Tibco plugin and tried again.

The Milo call is not recognized as a child span of the incoming Tibco message transaction/span, because I always get the above-mentioned IllegalStateException when it tries to make the child span for the Milo call the current span (so Span.makeCurrent() fails).

Any idea why that could be? Is that a bug in the new plugin API, or am I forgetting to set something?

P.S. I changed nothing in the above code for the Milo instrumentation.
My Tibco listener instrumentation is shortened to the following (with the code that reads the propagated context removed):

    String subject = message.getSendSubject();
    // This is the recommended way to obtain the tracer with the Elastic OpenTelemetry bridge
    Tracer tracer = GlobalOpenTelemetry.get().getTracer("TibcoServer");
    Span span = tracer.spanBuilder(subject)
            .setSpanKind(SpanKind.CONSUMER)
            .setAttribute("type", "incoming")
            .startSpan();
    // Return the Scope object so that it can be closed in the OnMethodExit method
    return span.makeCurrent();

By chance, could you share your instrumentation plugin code?

I have tried to reproduce a similar situation with the sample plugin repository that we have, but it works as expected in my case.

What I did is simply to create a "child span" while the parent span (the one that wraps the HTTP request) is active.
In Kibana, I can see both the transaction for the HTTP request and the child span, there is no visible agent error in the logs.

Can I send it to you via DM?

Tibco is proprietary and not an open library, so you will probably not get it to run (even the required client library is not in a public Maven repository).
You could test the Eclipse Milo instrumentation, though, because there is a public server you could work against (see GitHub - eclipse/milo: Eclipse Milo™ - an open source implementation of OPC UA (IEC 62541)).

I sent you two links to the code of both plugins via DM.

I don't see any obvious issue in the code you provided.

Before trying to create the span for Milo, could you try to log/print the value of Span.current() ? It should be the span that was created from Tibco instrumentation.

Also, there are a few things that might be worth investigating for your instrumentation (but that aren't directly linked to this behavior, I think):

  • usage of hasSuperType is quite expensive, you should use it with getTypeMatcherPreFilter (see usages in agent code for examples)
  • AttributeServices.read returns a CompletableFuture, which means that what you currently instrument is the time spent creating the reply, which does not include the asynchronous execution. The active scope should be propagated within the CompletableFuture execution; as it stands, the Milo span is ended early and is very short. To fix that, you should wrap the returned CompletableFuture and make it end the span when it completes; closing the Scope is still required though.
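
The suggestion about the CompletableFuture can be sketched as follows. This is plain Java with a hypothetical `SimpleSpan` interface standing in for the two span methods involved, and `endSpanOnCompletion` is a name invented for this example. In an actual advice the wrapping would presumably be applied to the returned value on method exit, with the Scope still closed there as usual.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncSpanSketch {

    /** Hypothetical stand-in for the two span methods this sketch needs. */
    interface SimpleSpan {
        void recordException(Throwable error);

        void end();
    }

    /** Test double that remembers whether and how the span was ended. */
    static final class RecordingSpan implements SimpleSpan {
        volatile boolean ended;
        volatile Throwable error;

        @Override
        public void recordException(Throwable error) {
            this.error = error;
        }

        @Override
        public void end() {
            ended = true;
        }
    }

    /**
     * Returns a future that completes with the same outcome as the original one,
     * but only after the span has been ended.
     */
    static <T> CompletableFuture<T> endSpanOnCompletion(CompletableFuture<T> future, SimpleSpan span) {
        return future.whenComplete((value, error) -> {
            if (error != null) {
                span.recordException(error);
            }
            span.end();
        });
    }

    public static void main(String[] args) {
        RecordingSpan span = new RecordingSpan();
        CompletableFuture<String> read = new CompletableFuture<>();
        CompletableFuture<String> wrapped = endSpanOnCompletion(read, span);
        System.out.println("ended before completion: " + span.ended);
        read.complete("value");
        System.out.println("ended after completion: " + span.ended + ", result: " + wrapped.join());
    }
}
```

With this shape the span duration covers the asynchronous read and not just the creation of the future.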

I tried to print the Span.current() value in the Milo instrumentation, but that call is itself what throws the IllegalStateException.

        @Advice.OnMethodEnter(suppress = Throwable.class, inline = false)
        public static Object onEnterHandle(@Advice.Argument(2) List<ReadValueId> readValueIds) {
line 53 >   System.out.println("Before Milo-Span: "+Span.current());
            ...
java.lang.IllegalStateException: unexpected context type to upgrade: co.elastic.apm.agent.opentelemetry.sdk.OTelBridgeContext
	at co.elastic.apm.agent.opentelemetry.context.OTelContextStorage.current(OTelContextStorage.java:81)
	at io.opentelemetry.context.Context.current(Context.java:91)
	at io.opentelemetry.api.trace.Span.current(Span.java:36)
	at ***.eclipse_milo_apm_plugin.RequestValueInstrumentation$AdviceClass.onEnterHandle(RequestValueInstrumentation.java:53)
	...

I also tried to do the same in the TibcoListenerInstrumentation before and after activating the Consumer-Span and there it works...

Before TibcoListenerSpan: PropagatedSpan{ImmutableSpanContext{traceId=00000000000000000000000000000000, spanId=0000000000000000, traceFlags=00, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=false}}
After TibcoListenerSpan: OtelSpan['***.Development.Gateway.PCS7.Command.getValue' 00-8523d7e66d85ae908c29b458287c3b81-599ccf114f9e394c-01 (77cec3a0)]

The "After TibcoListenerSpan" is the output after makeCurrent on the Consumer-Span.

usage of hasSuperType is quite expensive, you should use it with getTypeMatcherPreFilter (see usages in agent code for examples)

I will see what I can find. Could you point me to an instrumentation that uses it?

AttributeServices.read returns a CompletableFuture

Thanks for pointing that out! I completely forgot that the read method is asynchronous... I will see how I can fix that (maybe I can find another method to instrument...).

Also maybe something is completely broken in my setup...

Now I also have the problem that the TibcoSendInstrumentation is no longer working.
I changed it so that it also creates a span (with SpanKind.PRODUCER), and for that I changed the return type of the onEnterHandle method to Object so that it returns the scope, which is then closed in the onExitHandle method.
Ever since this change, the agent can no longer find the onEnterHandle method because it searches for the one that returns void...

2022-08-05 08:14:35,161 [main] ERROR co.elastic.apm.agent.bci.IndyBootstrap - no such method: ***.tibco_apm_plugin.TibcoSendInstrumentation$AdviceClass.onEnterHandle(TibrvMsg)void/invokeStatic
java.lang.NoSuchMethodException: no such method: ***.tibco_apm_plugin.TibcoSendInstrumentation$AdviceClass.onEnterHandle(TibrvMsg)void/invokeStatic
	at java.lang.invoke.MemberName.makeAccessException(MemberName.java:871) ~[?:1.8.0_282]
	at java.lang.invoke.MemberName$Factory.resolveOrFail(MemberName.java:1003) ~[?:1.8.0_282]
	at java.lang.invoke.MethodHandles$Lookup.resolveOrFail(MethodHandles.java:1386) ~[?:1.8.0_282]
	at java.lang.invoke.MethodHandles$Lookup.findStatic(MethodHandles.java:780) ~[?:1.8.0_282]
	at co.elastic.apm.agent.bci.IndyBootstrap.bootstrap(IndyBootstrap.java:419) [elastic-apm-agent-1.33.0.jar:1.33.0]
	at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_282]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
	at java.lang.IndyBootstrapDispatcher.bootstrap(IndyBootstrapDispatcher.java:60) [?:1.8.0_282]
	at java.lang.invoke.CallSite.makeSite(CallSite.java:310) [?:1.8.0_282]
	at java.lang.invoke.MethodHandleNatives.linkCallSiteImpl(MethodHandleNatives.java:307) [?:1.8.0_282]
	at java.lang.invoke.MethodHandleNatives.linkCallSite(MethodHandleNatives.java:297) [?:1.8.0_282]
	at com.tibco.tibrv.TibrvTransport.send(TibrvTransport.java:148) [tibrvj-8.4.2-x86.jar:?]

I changed the return type back to void and it is working again. I am lost...
This change should already be in the version I sent you, if you would like to take a second look... I certainly don't know what is going on...
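
One way to read the NoSuchMethodException above: the stack trace goes through MethodHandles.Lookup.findStatic, which resolves a method by its exact type, return type included. The sketch below reproduces only that behavior in isolation. `AdviceClass` here is a hypothetical stand-in rather than the plugin's class, and the example says nothing about which side held the outdated signature.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class SignatureLookupSketch {

    /** Hypothetical advice class whose enter method returns Object instead of void. */
    static final class AdviceClass {
        static Object onEnterHandle(String message) {
            return "scope for " + message;
        }
    }

    /** Returns true if onEnterHandle(String) can be resolved with the given return type. */
    static boolean resolves(Class<?> returnType) {
        try {
            MethodHandles.lookup().findStatic(
                    AdviceClass.class,
                    "onEnterHandle",
                    MethodType.methodType(returnType, String.class));
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("lookup with Object return type: " + resolves(Object.class));
        System.out.println("lookup with void return type: " + resolves(void.class));
    }
}
```

So a lookup that still expects the void signature cannot be satisfied by a class that declares the Object-returning method, and the other way around.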

Edit:
There seems to be a caching issue.
I also tried to change the definition of onExitHandle in the Milo instrumentation to include @Advice.Return, but now the agent can no longer find the method because it searches for the old definition without the parameter.
I already tried restarting everything I can restart (including my whole PC), but it seems to persist somewhere... Any idea how I can reset the advice method definitions?

Edit2:
On second thought, it can't be a problem with my PC and caching, because I tried it on another PC and had the same issues with the signatures not being found... and the APM server probably has nothing to do with this.

Hi,

For me there are two distinct issues:

  • the NoSuchMethodException that you just reported, which seems linked to your instrumentation code changes
  • the IllegalStateException, which prevents calling Span.current().

For now, the most important one is the latter, thus I think it's better if we keep focused on solving this one first before going further.

One thing that I forgot to ask about your application is about the runtime and packaging.

  • how is the application packaged and deployed? Is it running on an application server?
  • do you know if the application relies on OSGi or custom classloading strategies?

Last but not least, as it's PTO season here it's quite likely that you won't have timely replies to this post in the coming weeks, but keep us posted on your progress anyway.

Yeah, we can concentrate on the second one first.

Here are the answers to your questions:

  • For now the application is running in my local Eclipse (simply started via run java application). In production the application will simply be run via java -jar .
  • I think the framework we are using does some custom classloading (I'm not 100 percent sure about this). It is not OSGi (so no dynamic loading, everything must be there on application start), but it has config files that specify which class names should be used for specific subjects on the message bus; it initializes these classes during application start and links them to the specific subject.

Does your application have Maven dependencies on the instrumentation plugins by chance? If that is the case (even if they aren't included in the packaged .jar), then Eclipse might put them on the application classpath and thus load the instrumentation in the wrong classloader.

What I usually suggest here is to have two distinct IDE projects:

  • one with the instrumentation plugins (1)
  • one with the application (2)

(1) will have compile-time dependencies to the application, the packaged artifacts won't include the application artifacts (because they are in provided scope).
(2) will not have any declared dependency in pom.xml to the instrumentation plugins nor the agent if you use -javaagent setup method, and only apm-agent-attach if you use programmatic self-attach.
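
As a small diagnostic sketch for this kind of setup problem, the following plain-Java helper reports which classloader defined a class and where its bytes came from. `ClassOriginSketch` and `describe` are names made up for this example. Called with a plugin class, for example from application code or temporarily from an advice, it may help spot a copy of the plugin that comes from the application classpath.

```java
import java.security.CodeSource;

final class ClassOriginSketch {

    /** Describes which classloader defined the class and where its bytes came from. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return type.getName()
                + " loader=" + (loader == null ? "bootstrap" : loader)
                + " location=" + (source == null ? "unknown" : source.getLocation());
    }

    public static void main(String[] args) {
        // A JDK class is defined by the bootstrap loader and has no code source.
        System.out.println(describe(String.class));
        // An application class reports the directory or jar it was loaded from.
        System.out.println(describe(ClassOriginSketch.class));
    }
}
```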

There are two things worth trying to investigate further:

  • package the application and instrumentation plugins and run them in a similar way as in production to see if the issue is still triggered.
  • use this agent snapshot that will provide a bit more information about the issue.

Now I feel dumb...
You are right: I had an older version of the instrumentation plugins referenced in the application (because of packaging needs), so it obviously also had the old classes in the classloader. I changed it to use the local snapshot version, and all changes to the plugins are now detected correctly (no NoSuchMethodException any more).

Back to square one and my IllegalStateException...
Could you specify what should change with the snapshot version of the agent? I don't see any changes in the logs (at least not on DEBUG level) when I run it in Eclipse.
Edit: Here is the apm-agent.log from my request: elastic-apm.log · GitHub

I will try to run the app standalone and see if that changes anything.
Edit2: OK, I tried running the application outside of Eclipse (via a simple java command to start the app). Unfortunately, the exception is the same as when running inside Eclipse.

java.lang.IllegalStateException: unexpected context type to upgrade: co.elastic.apm.agent.opentelemetry.sdk.OTelBridgeContext
        at co.elastic.apm.agent.opentelemetry.context.OTelContextStorage.current(OTelContextStorage.java:81)
        at io.opentelemetry.context.Context.current(Context.java:91)
        at io.opentelemetry.api.trace.Span.current(Span.java:36)
        at ***.eclipse_milo_apm_plugin.RequestValueInstrumentation$AdviceClass.onEnterHandle(RequestValueInstrumentation.java:72)
        at org.eclipse.milo.opcua.sdk.client.OpcUaClient.read(OpcUaClient.java:513)
        at org.eclipse.milo.opcua.sdk.client.nodes.UaNode.readAttributeAsync(UaNode.java:599)
        at org.eclipse.milo.opcua.sdk.client.nodes.UaNode.readAttribute(UaNode.java:558)
        at org.eclipse.milo.opcua.sdk.client.nodes.UaVariableNode.readValue(UaVariableNode.java:306)
        at ***.application.server.Opc.testConnection(XfabOpc.java:196)
        at ***.application.server.Opc.getClient(XfabOpc.java:177)
        at ***.application.server.Opc.getValue(XfabOpc.java:252)
        at ***.application.server.services.CSysPcs7GetValue.runNow(CSysPcs7GetValue.java:25)
        at de.systemagmbh.products.clientserver.standard.service.ASysServerService.run(ASysServerService.java)
        at de.systemagmbh.products.clientserver.standard.service.ASysServerService.run(ASysServerService.java)
        at de.systemagmbh.tools.bus.CSysServiceRunner.run(CSysServiceRunner.java)
        at de.systemagmbh.tools.bus.ASysServiceRunnerCallback.startServiceRunner(ASysServiceRunnerCallback.java)
        at de.systemagmbh.components.bus.tibrv.CSysServiceRunnerCallback.<init>(CSysServiceRunnerCallback.java)
        at de.systemagmbh.components.bus.tibrv.ASysServiceHandler.onMsg(ASysServiceHandler.java)
        at com.tibco.tibrv.TibrvListener.invoke(TibrvListener.java:142)
        at com.tibco.tibrv.TibrvImplQueueC.natDispatch(Native Method)
        at com.tibco.tibrv.TibrvImplQueueC.dispatch(TibrvImplQueueC.java:44)
        at com.tibco.tibrv.TibrvQueue.dispatch(TibrvQueue.java:301)
        at com.tibco.tibrv.TibrvDispatcher.run(TibrvDispatcher.java:169)

Any other ideas?

Small Update on this issue:
I tried to use only one plugin and that works without problems.
Even if I create the transaction with the help of the public API (via @CaptureTransaction), the span of the plugin is correctly attached to this manually created transaction.

The problem only occurs if I use both plugins at the same time, and then it is irrelevant whether the second plugin creates its span within the span of the other plugin or only after the first plugin has already closed its span. I tested the latter by reversing the calling order of the plugins: the Eclipse Milo plugin creates a span for its request and closes it again, and only after that does the Tibco plugin try to create a span to publish the result on the message bus, all within a manually created transaction as mentioned above.

Hi,

I'm just back to work, and I haven't forgotten about your issue here.

Your last message gave me an idea of a potential issue we might have: from what I understand if you use only one of your plugins everything works as expected, but when you try to use them both at the same time this IllegalStateException occurs.

I am not very familiar with the details of external plugin classloading and I need to investigate that, but it is definitely possible that the OTelBridgeContext class is loaded in two distinct classloaders (one for each external plugin), which could explain this behavior.
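
To make this hypothesis concrete, here is a minimal plain-Java sketch of the general JVM behavior it relies on: a class defined by two different classloaders yields two distinct types, even though the name and bytecode are identical. `BridgeContext` and `PluginLoader` are hypothetical stand-ins, this is not the agent's classloading code, and the example requires Java 9 or later for `readAllBytes`.

```java
import java.io.IOException;
import java.io.InputStream;

final class TwoLoadersSketch {

    /** Stand-in for a context class that each plugin classloader defines for itself. */
    public static class BridgeContext {}

    /** Hypothetical plugin loader: defines BridgeContext itself instead of asking its parent. */
    static final class PluginLoader extends ClassLoader {
        PluginLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(BridgeContext.class.getName())) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readBytes(name.replace('.', '/') + ".class");
                    loaded = defineClass(name, bytes, 0, bytes.length);
                }
                return loaded;
            }
        }

        private byte[] readBytes(String resource) throws ClassNotFoundException {
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(resource);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(resource, e);
            }
        }
    }

    /** Loads BridgeContext through a fresh plugin loader. */
    static Class<?> loadInNewPlugin() {
        try {
            ClassLoader app = TwoLoadersSketch.class.getClassLoader();
            return new PluginLoader(app).loadClass(BridgeContext.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if a context created by one plugin passes a type check in the other. */
    static boolean sharedAcrossPlugins() {
        try {
            Class<?> first = loadInNewPlugin();
            Class<?> second = loadInNewPlugin();
            Object context = first.getDeclaredConstructor().newInstance();
            return second.isInstance(context);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("context shared across plugins: " + sharedAcrossPlugins());
    }
}
```

Code in one plugin that checks or casts a context object created through the other plugin's copy of the class would then see an unexpected type, which would be consistent with the message in the exception above.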

Good news, I have managed to reproduce it locally :tada: by just creating two distinct external plugins and trying to use them at the same time.

Surprisingly, this is the first report we have of this issue, so I am guessing that most users of external plugins have only used a single one so far.

Hi @Shaoranlaos , I have identified the underlying cause of the issue and have re-purposed the PR #2735 with a first version of the fix.

Could you try this snapshot and tell me if that now works as expected within your application?

In the meantime, I will be adding more tests to cover this case properly and prevent regressions in the future.

I have done a short test and it seems to work :tada:

I will see if I can test a bit more in the next few days, but I am going on a two-week vacation starting next week, so you may not hear from me until the 12th or 13th of September.