Why is the exception stack trace captured by the APM agent different from the format in the log?

log file format

apm Span stack details

I can't understand the stack information provided by apm.

The screenshot you pasted from Kibana (the second one) is a stack trace for a given span, not for a given error. If there is also an error associated with that span, then there should be another stack trace on the error itself. The two stack traces you show can't be the same: the first one shows an exception, the second doesn't.

Having said that, there is still room for improvement in capturing these stack traces. First of all, we could do better with async methods; there is already an issue for that in the agent repo.

Also, on the second screenshot I see that an HTTP request happens, so that is an automatically captured span for an outgoing HTTP request. Unfortunately, the way we capture the HTTP request does not give us a callstack that also contains the user code; we only see the stack after the async call. To improve this I opened this issue.

Nevertheless, for the exception I'd expect an error to show up, and the callstack on that error should be very similar to what you show in your first screenshot.

@GregKalapos
Thank you for your reply. Sorry, I could not find a picture that matches the APM stack, so I posted a similar one.

Indeed, I need the HTTP request callstack.

I want to see the cause of a back-end 500 error. APM does not show me stack information like that in the log file, and we are used to reading the stack format in the log.

Finally, do I need to wait for the next version to see the "stack format in the log file"?

Hi @wajika

I think you asked also on the GitHub PR, but let’s just also follow up here.

So, yeah, first of all there were some stack trace related PRs merged, and in the next release you’ll be able to see where in your code the outgoing HTTP request happens.

Now, I’d like to also add some comments to the screenshot you sent:

As it seems the outgoing HTTP request itself returned HTTP200, but your service (the POST IoT/CheckAndGetProductInfo) returned HTTP 500.

Of course I don’t know the reason for that, but I would like to mention that if you just set the response status code to HTTP500 in your service but no exception leaves the pipeline, then the agent won’t be able to capture the error. Similarly, if you for example catch every single exception in your service and just return HTTP500, then no exception will leave the pipeline and we won’t be able to capture any exception either.

In those cases the easiest is to just capture the exception manually when you handle it. Here is some doc on it.

So if you have some global error handling part that makes sure no exception leaves the pipeline and your service just returns HTTP500, you can do something like this:

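try
{
   // your request handling code
}
catch (Exception e) // Some global error handler
{
   // Report the handled exception to the agent (Agent is in the Elastic.Apm namespace)
   Agent.Tracer.CurrentTransaction?.CaptureException(e);
   // rest of your code
}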

That code will add the exception to your transaction and it'll show up on the UI.

@GregKalapos

Thank you for your reply.

Sorry. I don't understand what you mean

question 1

As it seems the outgoing HTTP request itself returned HTTP200, but your service (the POST IoT/CheckAndGetProductInfo) returned HTTP 500.

In this case, does APM consider it a success?

question 2

no exception leaves the pipeline then the agent won’t be able to capture the error
AND
Similarly if you for example catch every single exception in your service and just return HTTP500 then no exception will leave the pipeline and we won’t be able to capture any exception either.

I didn't understand

I still have one thing I don’t understand, the service generated an http500 error, why is it showing the error code of apm agent?

Hi @wajika

Question1:

What will happen is that the StatusCode will be set to HTTP500, which the UI will show with a red background, but no error will be captured.

Question2:

Let me illustrate this with some code. Let’s say you have something like this:


app.Run(async context =>
{
   context.Response.StatusCode = (int)HttpStatusCode.InternalServerError;
   await context.Response.WriteAsync("Hello, World!");
});

Now, the line context.Response.StatusCode = (int)HttpStatusCode.InternalServerError; could be anywhere. If you have, let’s say, ASP.NET Core MVC and you do something like this in a controller method, then it’s the same:


public IActionResult Index()
{
   try
   {
      //Do some work
   }
   catch (Exception e)
   {
      return StatusCode(500);
   }

   return View();
}

No exception leaves the pipeline in those cases, so the agent has no chance to capture it for you. It’ll capture the HTTP500, which is the response code of the request, but it won’t capture an error, since no error left the pipeline: in the first snippet there is no exception at all, and in the second one you handled it. That’s why I suggested capturing the error manually in my previous comment.

On the other hand, if you do this:


app.Run(async context =>
{
   throw new Exception();
});

or this:


public IActionResult Index()
{
   try
   {
      //Do some work
   }
   catch (Exception e)
   {
      throw;
   }

   return View();
}

Then there is an exception leaving the pipeline so the agent will observe that and show an error on the APM UI.
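If you would rather keep handling the exception and returning HTTP500 yourself, the controller snippet above can be combined with the manual capture from my previous comment. This is only a minimal sketch: HomeController and DoSomeWork are made-up names for illustration, and it assumes the Elastic.Apm and ASP.NET Core MVC packages are referenced and the agent is already enabled in your app.

```csharp
using System;
using Elastic.Apm;               // assumed: the Elastic.Apm NuGet package is referenced
using Microsoft.AspNetCore.Mvc;

public class HomeController : Controller   // hypothetical controller name
{
    public IActionResult Index()
    {
        try
        {
            DoSomeWork();   // hypothetical placeholder for your own logic
        }
        catch (Exception e)
        {
            // The exception is handled here and never leaves the pipeline,
            // so hand it to the agent manually before returning HTTP 500.
            // CurrentTransaction is null when no transaction is active,
            // hence the null-conditional call.
            Agent.Tracer.CurrentTransaction?.CaptureException(e);
            return StatusCode(500);
        }

        return View();
    }

    // Hypothetical helper that simulates a failure.
    private static void DoSomeWork() =>
        throw new InvalidOperationException("simulated failure");
}
```

With this the response is still HTTP500, but the exception should also be attached to the current transaction as an error.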

I still have one thing I don’t understand, the service generated an http500 error, why is it showing the error code of apm agent?

Sorry, I don’t fully understand. If your question is why you see APM code on the callstack, it’s because the agent subscribes to some internal events, so by the time the stack trace is captured the agent is already on the callstack. We don’t trim those frames: we show you the real callstack, and since there are agent frames on it, those show up too.
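To illustrate the general effect (this is not the agent's actual code), here is a small self-contained sketch using only System.Diagnostics.StackTrace. FakeAgent and HttpLibrary are made-up stand-ins: the "agent" subscribes to an event and captures the call stack from inside its own handler, so the handler itself ends up as a frame on the captured stack.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for an agent that listens to library events.
public static class FakeAgent
{
    public static string[] LastFrames = Array.Empty<string>();

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void OnRequestFinished(object sender, EventArgs e)
    {
        // The stack is captured here, inside the handler, so this method
        // is itself one of the frames on the captured stack.
        LastFrames = new StackTrace()
            .GetFrames()
            .Select(f => $"{f.GetMethod()?.DeclaringType?.Name}.{f.GetMethod()?.Name}")
            .ToArray();
    }
}

// Hypothetical stand-in for a library that raises an event the agent listens to.
public static class HttpLibrary
{
    public static event EventHandler RequestFinished;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void SendRequest()
    {
        RequestFinished?.Invoke(null, EventArgs.Empty);
    }
}

public static class Program
{
    public static void Main()
    {
        HttpLibrary.RequestFinished += FakeAgent.OnRequestFinished;
        HttpLibrary.SendRequest();

        // Print every captured frame, including the handler's own frame.
        foreach (var frame in FakeAgent.LastFrames)
            Console.WriteLine(frame);
    }
}
```

The printed frames include both the library method that raised the event and the handler that captured the stack, which is the same reason agent frames appear on the stack traces you see in the UI.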

If your question is different feel free to elaborate.