Span duration of an HTTP GET calculated wrongly?

APM Server version: 8.9.1

APM Agent language and version: Java 1.49.0

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant):

While debugging a long running transaction we found that one of the methods we implemented ourselves took 50s while basically all it does is one HTTP GET call to a Keycloak REST endpoint returning all users know to the identity provider which according to the Timeline view only took 200ms:

ElasticAPM_HTTP_chunked

Technically we use the Keycloak.findAllUsers() method from org.keycloak:keycloak-admin-client:23.0.7 that internally uses org.jboss.resteasy:resteasy-client:6.2.4.Final for the actual HTTP requests.

Performing the same GET /admin/realms/test/users?max=6000 in Postman takes the full 50s to return all users, not 200ms as seen in the Timeline view above.

The response from Keycloak is streamed (transfer-encoding: chunked) and my assumption now is that only the first chunk is taken into account for the duration of the HTTP GET span in the screenshot above.

Am I right with this assumption?
And if yes, is there something to be done about it as it led to quite some confusion (we actually thought changing the code to see if a different Java HTTP client implementation would be faster).

Steps to reproduce:

  1. trigger a REST endpoint in a Spring Boot application using the APM Java agent that calls Keycloak.findAllUsers() in a method annotated with @CaptureSpan and otherwise has no other logic in it (use a significantly large number of existing users in Keycloak; e.g. 2000)
  2. check the corresponding transaction under Observability > APM
    • the HTTP span should make up basically all of the annotated Java method but doesn't

The duration of HTTP Client spans is the time to first byte.
This has several reasons:

  • From instrumentation perspective we have no guarantee whether the client will actually consume the response body: If it doesn't we would end up with a started but never stopped Span
  • For many HTTP-clients it is very difficult to find an instrumentation hook for when the response has been consumed
  • The duration until the response body has been fully consumed is actually also heavily dependent on the client/consumer: Because the data is streamed, the consumer is a big factor in the pace until the response has been fully consumed. E.g. if you issue DB calls (or do something else slowly) synchronously while streaming the response data you'd also end up with a duration which has nothing todo with the speed of the server or network.

If you are interested in the total time until the response has been fully received and processed, you need to do exactly what you suggested: manually instrument a method which finished after the response has been consumed.

Thanks for the detailed answer, Jonas!

That makes a lot of sense and we were just off the track with our interpretation. :wink: