Monitoring for In-flight Requests / Jobs

Hi guys!

Is there any way to monitor in-flight requests (i.e. requests that are currently running but have not finished)?

I don't think there is, so I opened a GitHub issue suggesting that this would be a nice addition to the project (https://github.com/elastic/apm-agent-java/issues/1055), but it was closed with a suggestion to open a discussion here.

Basically, I just want to know exactly what is running on the server in real time...
This is extremely important, in my opinion, for real time monitoring.
If we know exactly what is currently running in a server, we can quickly discover (or at least have an idea of):

  1. What is causing a CPU / memory spike;
  2. If there's something stuck in the server;
  3. If there are long-running requests.

Is there any way of doing that, currently?
If there's not, do you think it would be reasonable to implement it?

Thanks!

Hey Daniel :wave:
The agent normally sends transactions and spans to the APM server only after they have ended. However, we are just about to release a very cool feature that analyses threads' stack traces in the background and infers spans representing long-running methods. While this feature on its own will not give you exactly what you need (it relies on the transaction being deactivated on the thread before analysing the related call tree), it does provide the technical mechanism that could enable it. This still doesn't mean we are going to implement it soon, but it is interesting :slight_smile:.
cc @felixbarny

Hey, Eyal.

Really cool feature, indeed... I'll probably use it in production :smiley:.
Still, as you said, it does not satisfy exactly what I was thinking of.
If I understood the feature correctly, it sends the spans as they're done, without having to wait for the whole transaction to end.

In my case, I would like to monitor the transaction itself without having to wait for it to finish.

The following info could be searchable while the request is still in flight:

transaction.id
transaction.name
transaction.status (running, aborted or completed)
transaction.start_timestamp

And if the transaction is an HTTP request:

url.domain
url.full
url.path
url.port
url.scheme

Any info that was already attached to the transaction (via the public API) could also be sent.
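
To make the idea a bit more concrete, here is a rough sketch of what tracking those fields in-process could look like. This is not how the agent works today and it uses none of its API: `InFlightRegistry`, `InFlightTransaction`, `onStart`, `onEnd` and `runningSince` are all hypothetical names I made up for illustration.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/** Hypothetical registry of transactions that have started but not yet ended. */
final class InFlightRegistry {

    /** Status values from the field list above. */
    enum Status { RUNNING, ABORTED, COMPLETED }

    /** Immutable snapshot of the proposed fields; urlFull is null for non-HTTP transactions. */
    static final class InFlightTransaction {
        final String id;
        final String name;
        final Status status;
        final Instant startTimestamp;
        final String urlFull;

        InFlightTransaction(String id, String name, Status status,
                            Instant startTimestamp, String urlFull) {
            this.id = id;
            this.name = name;
            this.status = status;
            this.startTimestamp = startTimestamp;
            this.urlFull = urlFull;
        }
    }

    private final Map<String, InFlightTransaction> running = new ConcurrentHashMap<>();

    /** Registers a transaction as RUNNING as soon as it starts. */
    void onStart(String id, String name, Instant start, String urlFull) {
        running.put(id, new InFlightTransaction(id, name, Status.RUNNING, start, urlFull));
    }

    /** Removes the transaction and returns a copy with its final status, or null if unknown. */
    InFlightTransaction onEnd(String id, Status finalStatus) {
        InFlightTransaction tx = running.remove(id);
        if (tx == null) {
            return null;
        }
        return new InFlightTransaction(tx.id, tx.name, finalStatus, tx.startTimestamp, tx.urlFull);
    }

    /** Running transactions that started at or before the cutoff, oldest first. */
    List<InFlightTransaction> runningSince(Instant cutoff) {
        return running.values().stream()
                .filter(tx -> !tx.startTimestamp.isAfter(cutoff))
                .sorted(Comparator.comparing((InFlightTransaction tx) -> tx.startTimestamp))
                .collect(Collectors.toList());
    }
}
```

Calling `runningSince(Instant.now().minusSeconds(60))` would then list everything that has been running for at least a minute, which covers the "stuck" and "long-running" cases from my first post.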

One problem I can think of is if the server dies abruptly while there are transactions that are still running.
There would have to be some way of telling the server that the request was aborted / the system died (maybe a ping to the apm-server every N seconds, saying that the transaction is still alive).
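
Something like the following is what I have in mind for the receiving side. Again, this is only a sketch under my own assumptions: `HeartbeatTracker`, `heartbeat`, `completed` and `expire` are hypothetical names, and nothing like this exists in apm-server as far as I know. The idea is that any transaction whose last ping is older than a timeout gets treated as aborted.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical server-side tracker that marks silent transactions as aborted. */
final class HeartbeatTracker {

    private final Duration timeout;
    private final Map<String, Instant> lastSeen = new ConcurrentHashMap<>();

    /** The timeout should be longer than the ping interval N to tolerate a late ping. */
    HeartbeatTracker(Duration timeout) {
        this.timeout = timeout;
    }

    /** Called each time a "still alive" ping arrives for a transaction. */
    void heartbeat(String transactionId, Instant now) {
        lastSeen.put(transactionId, now);
    }

    /** Called when the transaction ends normally, so it is never reported as aborted. */
    void completed(String transactionId) {
        lastSeen.remove(transactionId);
    }

    /** Removes and returns the ids whose last ping is older than the timeout. */
    List<String> expire(Instant now) {
        List<String> aborted = new ArrayList<>();
        for (Map.Entry<String, Instant> e : lastSeen.entrySet()) {
            boolean stale = Duration.between(e.getValue(), now).compareTo(timeout) > 0;
            // remove(key, value) only succeeds if no newer ping arrived in the meantime
            if (stale && lastSeen.remove(e.getKey(), e.getValue())) {
                aborted.add(e.getKey());
            }
        }
        return aborted;
    }
}
```

The server would call `expire(Instant.now())` periodically and flip the returned transactions to the aborted status.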
