Auto instrumentation - Metrics work, traces not received

We want to trace all of our applications (built with the OutSystems low-code platform), so we used the auto instrumentation profiler on IIS 10.0. We receive metrics, but traces are missing. The profiler logs show it loading/instrumenting 600+ modules, and we saw SQL ExecuteScalar entries in them, so we believe it is able to hook into our applications.

After adding the APM trace listener to the web.config of one of our applications, we noticed that threads are being aborted while reading the central config and, I assume, in the API call that sends traces to Elastic.

Kibana version: 7.16.3

Elasticsearch version: 7.16.3

APM Server version: 7.16.2

APM Agent language and version: 1.13.0 CLR profiler

Original install method (e.g. download page, yum, deb, from source, etc.) and version: from source (GitHub APM agent repository)

Fresh install or upgraded from other version? Fresh

Is there anything special in your setup? No, just IIS 10
Runtime name & version: .NET Framework 4.8.4420.0
Framework name: ASP.NET

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant): We are only receiving metrics from the agent, no APM traces. After adding the trace listener to the web.config we saw these errors occurring:

[2022-01-27 09:41:35.706 +01:00][Error] - {ExceptionUtils.DoSwallowingExceptions} `ElasticApmCentralConfigurationFetcher' (managed ID: 68) thread entry method is about to exit because of exception+-> Exception: System.Threading.ThreadAbortException: Thread was being aborted.
   at System.Threading.Monitor.ObjWait(Boolean exitContext, Int32 millisecondsTimeout, Object obj)
   at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeout(Int32 millisecondsTimeout, UInt32 startTime, CancellationToken cancellationToken)
   at System.Threading.SemaphoreSlim.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Collections.Concurrent.BlockingCollection`1.TryTakeWithNoTimeValidation(T& item, Int32 millisecondsTimeout, CancellationToken cancellationToken, CancellationTokenSource combinedTokenSource)
   at System.Collections.Concurrent.BlockingCollection`1.<GetConsumingEnumerable>d__68.MoveNext()
   at Elastic.Apm.Helpers.SingleThreadTaskScheduler.<ThreadMain>b__14_0() in C:\Users\jenkins\workspace\net_apm-agent-dotnet-mbp_PR-1594\apm-agent-dotnet\src\Elastic.Apm\Helpers\SingleThreadTaskScheduler.cs:line 50
   at Elastic.Apm.Helpers.ExceptionUtils.DoSwallowingExceptions(IApmLogger loggerArg, Action action, Boolean shouldSwallowCancellation, String dbgCallerMethodName) in C:\Users\jenkins\workspace\net_apm-agent-dotnet-mbp_PR-1594\apm-agent-dotnet\src\Elastic.Apm\Helpers\ExceptionUtils.cs:line 29

[2022-01-27 09:41:35.706 +01:00][Error] - {ExceptionUtils.DoSwallowingExceptions} `ElasticApmPayloadSenderV2' (managed ID: 67) thread entry method is about to exit because of exception+-> Exception: System.Threading.ThreadAbortException: Thread was being aborted.
   at System.Threading.Monitor.ObjWait(Boolean exitContext, Int32 millisecondsTimeout, Object obj)
   at System.Threading.SemaphoreSlim.WaitUntilCountOrTimeout(Int32 millisecondsTimeout, UInt32 startTime, CancellationToken cancellationToken)
   at System.Threading.SemaphoreSlim.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Collections.Concurrent.BlockingCollection`1.TryTakeWithNoTimeValidation(T& item, Int32 millisecondsTimeout, CancellationToken cancellationToken, CancellationTokenSource combinedTokenSource)
   at System.Collections.Concurrent.BlockingCollection`1.<GetConsumingEnumerable>d__68.MoveNext()
   at Elastic.Apm.Helpers.SingleThreadTaskScheduler.<ThreadMain>b__14_0() in C:\Users\jenkins\workspace\net_apm-agent-dotnet-mbp_PR-1594\apm-agent-dotnet\src\Elastic.Apm\Helpers\SingleThreadTaskScheduler.cs:line 50
   at Elastic.Apm.Helpers.ExceptionUtils.DoSwallowingExceptions(IApmLogger loggerArg, Action action, Boolean shouldSwallowCancellation, String dbgCallerMethodName) in C:\Users\jenkins\workspace\net_apm-agent-dotnet-mbp_PR-1594\apm-agent-dotnet\src\Elastic.Apm\Helpers\ExceptionUtils.cs:line 29

Steps to reproduce:

  1. We used this configuration to install the profiler:
$appcmd = "$($env:systemroot)\system32\inetsrv\AppCmd.exe"
$appPool = "Apppoolname" 
$profilerHomeDir = "E:\ElasticStack\elastic_apm_profiler" 
$environment = @{
  COR_ENABLE_PROFILING = "1"
  COR_PROFILER_PATH = "$profilerHomeDir\elastic_apm_profiler.dll"
  COR_PROFILER = "{FA65FE15-F085-4681-9B20-95E04F6C03CC}"
  COMPlus_LoaderOptimization = "1" 
  ELASTIC_APM_PROFILER_HOME = "$profilerHomeDir\"
  ELASTIC_APM_PROFILER_INTEGRATIONS = "$profilerHomeDir\integrations.yml"
  ELASTIC_APM_PROFILER_LOG_DIR = "$profilerHomeDir\logs"
  ELASTIC_APM_PROFILER_LOG = "trace"
  ELASTIC_APM_PROFILER_LOG_IL = "1"
}

$environment.Keys | ForEach-Object {
  write-output "[name='$appPool'].environmentVariables.[name='$_',value='$($environment[$_])']"
  #delete entry
  & $appcmd set config -section:system.applicationHost/applicationPools /-"[name='$appPool'].environmentVariables.[name='$_']"
  #add entry
  & $appcmd set config -section:system.applicationHost/applicationPools /+"[name='$appPool'].environmentVariables.[name='$_',value='$($environment[$_])']" 
  
}
& $appcmd list config -section:system.applicationHost/applicationPools/[name='$appPool']

  2. We edited the web.config of the application:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <appSettings>
    <add key="ElasticApm:ServerUrl" value="https://********.apm.westeurope.azure.elastic-cloud.com" />
    <add key="ElasticApm:SecretToken" value="********" />
    <add key="ElasticApm:Recording" value="true" />
    <add key="ElasticApm:Enabled" value="true" />
    <add key="ElasticApm:ServiceName" value="HyperionDevAPM" />
    <add key="ElasticApm:HostName" value="hyperion-dev" />
    <add key="ElasticApm:Environment" value="Hyperion Development" />	
    </appSettings>
</configuration>

Hope this is enough information to figure out the issue or what we did wrong. Please let me know if more information is required. Thanks in advance for any assistance!

No one with any suggestions or clues?

An issue I have seen before that results in metrics being received but not traces is the application also creating the agent, either by registering the ElasticApmModule in web.config or by performing an action (typically in Application_Start()) that initializes an agent instance, which conflicts with the instance created by the profiler. This is diagnosable in the APM agent log (not the profiler logs) by an entry with the message

The singleton APM agent has already been instantiated and can no longer be configured. Reusing existing instance

The resolution is to remove the code/config that creates the agent from the application.
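
For reference, a manual registration in web.config usually looks something like the sketch below (the module name and assembly shown are the usual ones for the Elastic.Apm.AspNetFullFramework package, but may differ in your setup); this is the kind of entry to look for and remove:

<!-- sketch of a manual ElasticApmModule registration; this (or equivalent
     Application_Start code) is what to look for and remove when the profiler
     is responsible for creating the agent -->
<system.webServer>
    <modules>
        <add name="ElasticApmModule" type="Elastic.Apm.AspNetFullFramework.ElasticApmModule, Elastic.Apm.AspNetFullFramework" />
    </modules>
</system.webServer>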

Thanks a lot Russ!

That makes a lot of sense. I thought I had disabled all code/applications that instantiate the agent through a .NET library. I'll go over it again; perhaps some code was left behind or an older version is still active.

We switched to the new Fleet implementation and have some upgrade issues to overcome before we can test it again.

Thanks for the help

It can sometimes come about very surreptitiously :smile:

For more context, the profiler injects a shim assembly that loads a supporting managed assembly containing the code for auto instrumentation. One of the first operations it performs is to ensure the singleton agent instance is initialized.

In the case of .NET Framework running in a hosted environment, e.g. IIS, it defers instantiation to the ASP.NET integration, which in turn ends up instantiating the agent in the ElasticApmModule that it registers in the ASP.NET pipeline.

This is primarily done to allow the agent to be configured with ASP.NET specific components.

Now, if for some reason you wanted to explicitly register ElasticApmModule yourself in web.config, and ultimately control the instantiation of the agent instead of leaving it to the profiler, you could, as an alternative, disable the ASP.NET auto instrumentation integration with the Elastic APM profiler environment variable:

ELASTIC_APM_PROFILER_EXCLUDE_INTEGRATIONS=AspNet
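
If you go down that route, the variable can be added to the app pool with the same appcmd pattern used in your install script. A minimal sketch, assuming the same $appcmd and $appPool values as in that script:

$appcmd = "$($env:systemroot)\system32\inetsrv\AppCmd.exe"
$appPool = "Apppoolname"
# add ELASTIC_APM_PROFILER_EXCLUDE_INTEGRATIONS=AspNet to the app pool's environment variables
& $appcmd set config -section:system.applicationHost/applicationPools /+"[name='$appPool'].environmentVariables.[name='ELASTIC_APM_PROFILER_EXCLUDE_INTEGRATIONS',value='AspNet']"
# recycle the pool so worker processes pick up the new variable
& $appcmd recycle apppool /apppool.name:"$appPool"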

Thanks for the great insights!

We have the following setups:

  1. Hosted environments where we do not have control of the front-ends. For these, we created a .NET library so that we can instantiate the agent through Application_Start.
  2. On-premise environments. Here we just want to use the auto instrumentation, so we used the setup I described in my initial post. However, the test environment is also where we developed the .NET library and demo applications that initialize the agent. I disabled all of these applications, so they can no longer initialize the agent.

However, I still get the singleton error and haven't yet found what is causing it. Can you confirm that our setup is the correct way to auto instrument ASP.NET applications, or did we miss something?

Here is part of the profiler logs:

Initialize part:

[2022-02-04T07:47:26.097466000+01:00] [TRACE] Initialize: started. profiler package version 1.13.0-beta.1 (commit: 33c296c3ccc2f3158915d8b3cb67cc1db9397e78)
[2022-02-04T07:47:26.098146300+01:00] [DEBUG] Environment variables
 APP_POOL_ID="OutSystemsApplications"
 COMPlus_LoaderOptimization="1"
 COR_PROFILER_PATH="E:\ElasticStack\elastic_apm_profiler\elastic_apm_profiler.dll"
 ELASTIC_APM_PROFILER_INTEGRATIONS="E:\ElasticStack\elastic_apm_profiler\integrations.yml"
 ELASTIC_APM_PROFILER_HOME="E:\ElasticStack\elastic_apm_profiler\"
 ELASTIC_APM_PROFILER_LOG_DIR="E:\ElasticStack\elastic_apm_profiler\logs"
 ELASTIC_APM_PROFILER_LOG_IL="1"
 ELASTIC_APM_PROFILER_LOG="trace"
 COR_PROFILER="{FA65FE15-F085-4681-9B20-95E04F6C03CC}"
 COR_ENABLE_PROFILING="1"
[2022-02-04T07:47:26.108434900+01:00] [TRACE] loaded 9 integration(s) from E:\ElasticStack\elastic_apm_profiler\integrations.yml
[2022-02-04T07:47:26.108497700+01:00] [DEBUG] Problem reading ELASTIC_APM_PROFILER_CALLTARGET_ENABLED: environment variable not found. Setting to true
[2022-02-04T07:47:26.108736000+01:00] [DEBUG] Initialize: loaded 123 integration(s)
[2022-02-04T07:47:26.108782500+01:00] [INFO ] Initialize: CallTarget instrumentation is enabled
[2022-02-04T07:47:26.108795300+01:00] [DEBUG] Problem reading ELASTIC_APM_PROFILER_ENABLE_INLINING: environment variable not found. Setting to true
[2022-02-04T07:47:26.108802700+01:00] [INFO ] Initialize: JIT Inlining is enabled
[2022-02-04T07:47:26.108828500+01:00] [DEBUG] Problem reading ELASTIC_APM_PROFILER_DISABLE_OPTIMIZATIONS: environment variable not found. Setting to false
[2022-02-04T07:47:26.108836500+01:00] [TRACE] Initialize: set event mask2 to COR_PRF_MONITOR_MODULE_LOADS | COR_PRF_MONITOR_ASSEMBLY_LOADS | COR_PRF_MONITOR_APPDOMAIN_LOADS | COR_PRF_MONITOR_JIT_COMPILATION | COR_PRF_ENABLE_REJIT | COR_PRF_DISABLE_TRANSPARENCY_CHECKS_UNDER_FULL_TRUST | COR_PRF_DISABLE_ALL_NGEN_IMAGES, COR_PRF_HIGH_ADD_ASSEMBLY_REFERENCES

Something I found suspicious, but I need your help to identify whether it is part of the standard tracing or an actual profiler exception that explains the issue:

Message found: "Pre_application_start_init_method_threw_exception"

[2022-02-04T07:47:29.387208600+01:00] [DEBUG] calltarget_rewriter_callback: start System.Web.Compilation.BuildManager.InvokePreStartInitMethodsCore() [is_void=true, is_static=true, integration_type=Elastic.Apm.Profiler.Managed.Integrations.AspNet.ElasticApmModuleIntegration, arguments=2]
[2022-02-04T07:47:29.387654600+01:00] [TRACE] caller type.id: 100688388
caller type.is_generic: false
caller type.name: System.Web.Compilation.BuildManager
caller type.token_type: mdtTypeDef
caller type.spec: 452984832
caller type.is_value_type: false
caller type extend_from.id: 16777252
caller type extend_from.is_generic: false
caller type extend_from.name: System.Object
caller type extend_from.token_type: mdtTypeRef
caller type extend_from.spec: 452984832
caller type extend_from.is_value_type: false
[2022-02-04T07:47:29.387993800+01:00] [DEBUG] IL original code for caller: System.Web.Compilation.BuildManager.InvokePreStartInitMethodsCore => (max_stack: 5)
. Local Var Signature 0706151280ad011280c515128311011280c51280c51281e51287250e

  1808621858064    ldarg.0
  1808621858065       call 721420526  | System.Linq.Enumerable.Distinct(1 argument{s})
  1808621858070    stloc.0
  1808621858071    ldloc.0
  1808621858072     ldsfld 67124132
  1808621858077        dup
  1808621858078   brtrue.s 1808621858103
  1808621858080        pop
  1808621858081     ldsfld 67124130
  1808621858086      ldftn
  1808621858092     newobj 167776080  | System.Func`2..ctor(2 argument{s})
  1808621858097        dup
  1808621858098     stsfld 67124132
  1808621858103       call 167772752  | System.StringComparer.get_OrdinalIgnoreCase()
  1808621858108       call 721420527  | System.Linq.Enumerable.OrderBy(3 argument{s})
  1808621858113     ldsfld 67124133
  1808621858118        dup
  1808621858119   brtrue.s 1808621858144
  1808621858121        pop
  1808621858122     ldsfld 67124130
  1808621858127      ldftn
  1808621858133     newobj 167776080  | System.Func`2..ctor(2 argument{s})
  1808621858138        dup
  1808621858139     stsfld 67124133
  1808621858144       call 167772752  | System.StringComparer.get_OrdinalIgnoreCase()
  1808621858149       call 721420528  | System.Linq.Enumerable.ThenBy(3 argument{s})
  1808621858154    stloc.0
  1808621858155    ldloc.0
  1808621858156   callvirt 167776088  | System.Collections.Generic.IEnumerable`1.GetEnumerator()
  1808621858161    stloc.1
  .try {
    1808621858162       br.s 1808621858290
    1808621858164    ldloc.1
    1808621858165   callvirt 167776089  | System.Collections.Generic.IEnumerator`1.get_Current()
    1808621858170    stloc.2
    .try {
      1808621858171    ldarg.1
      1808621858172   callvirt 167776090  | System.Func`1.Invoke()
      1808621858177    stloc.3
      .try {
        1808621858178    ldloc.2
        1808621858179     ldnull
        1808621858180     ldnull
        1808621858181   callvirt 167772492  | System.Reflection.MethodBase.Invoke(2 argument{s})
        1808621858186        pop
        1808621858187    leave.s 1808621858199
      }
      .finally {
        1808621858189    ldloc.3
        1808621858190  brfalse.s 1808621858198
        1808621858192    ldloc.3
        1808621858193   callvirt 167772215  | System.IDisposable.Dispose()
        1808621858198 endfinally
      }
      1808621858199    leave.s 1808621858290
    }
    .catch {
      1808621858201    stloc.s 4
      1808621858203    ldloc.s 4
      1808621858205   callvirt 167772660  | System.Exception.get_InnerException()
      1808621858210   brtrue.s 1808621858219
      1808621858212     ldsfld 167772517
      1808621858217       br.s 1808621858231
      1808621858219    ldloc.s 4
      1808621858221   callvirt 167772660  | System.Exception.get_InnerException()
      1808621858226   callvirt 167772651  | System.Exception.get_Message()
      1808621858231    stloc.s 5
      1808621858233      ldstr 1879353730  | "Pre_application_start_init_method_threw_exception"
      1808621858238   ldc.i4.3
      1808621858239     newarr 16777252  | System.Object
      1808621858244        dup
      1808621858245   ldc.i4.0
      1808621858246    ldloc.2
      1808621858247   callvirt 167772681  | System.Reflection.MemberInfo.get_Name()
      1808621858252 stelem.ref
      1808621858253        dup
      1808621858254   ldc.i4.1
      1808621858255    ldloc.2
      1808621858256   callvirt 167772487  | System.Reflection.MemberInfo.get_DeclaringType()
      1808621858261   callvirt 167772443  | System.Type.get_FullName()
      1808621858266 stelem.ref
      1808621858267        dup
      1808621858268   ldc.i4.2
      1808621858269    ldloc.s 5
      1808621858271 stelem.ref
      1808621858272       call 100667721  | System.Web.SR.GetString(2 argument{s})
      1808621858277    ldloc.s 4
      1808621858279   callvirt 167772660  | System.Exception.get_InnerException()
      1808621858284     newobj 167774934  | System.InvalidOperationException..ctor(2 argument{s})
      1808621858289      throw
    }
    1808621858290    ldloc.1
    1808621858291   callvirt 167772264  | System.Collections.IEnumerator.MoveNext()
    1808621858296     brtrue 1808621858164
    1808621858301    leave.s 1808621858313
  }
  .finally {
    1808621858303    ldloc.1
    1808621858304  brfalse.s 1808621858312
    1808621858306    ldloc.1
    1808621858307   callvirt 167772215  | System.IDisposable.Dispose()
    1808621858312 endfinally
  }
  1808621858313        ret

IL modification for caller: System.Web.Compilation.BuildManager.InvokePreStartInitMethodsCore => (max_stack: 17)
. Local Var Signature 0709151280ad011280c515128311011280c51280c51281e51287250e1281d911911111910d

  .try {
    .try {
      .try {
        1808621858064       call 167776636  | Elastic.Apm.Profiler.Managed.CallTarget.CallTargetReturn.GetDefault()
        1808621858069    stloc.s 7
        1808621858071     ldnull
        1808621858072    stloc.s 6
        1808621858074     ldnull
        1808621858075    ldarg.0
        1808621858076    ldarg.1
        1808621858077       call 721420600  | Elastic.Apm.Profiler.Managed.CallTarget.CallTargetInvoker.BeginMethod(3 argument{s})
        1808621858082    stloc.s 8
        1808621858084    leave.s 1808621858093
      }
      .catch {
        1808621858086       call 721420601  | Elastic.Apm.Profiler.Managed.CallTarget.CallTargetInvoker.LogException(1 argument{s})
        1808621858091    leave.s 1808621858093
      }
      1808621858093    ldarg.0
      1808621858094       call 721420526  | System.Linq.Enumerable.Distinct(1 argument{s})
      1808621858099    stloc.0
      1808621858100    ldloc.0
      1808621858101     ldsfld 67124132
      1808621858106        dup
      1808621858107   brtrue.s 1808621858132
      1808621858109        pop
      1808621858110     ldsfld 67124130
      1808621858115      ldftn
      1808621858121     newobj 167776080  | System.Func`2..ctor(2 argument{s})
      1808621858126        dup
      1808621858127     stsfld 67124132
      1808621858132       call 167772752  | System.StringComparer.get_OrdinalIgnoreCase()
      1808621858137       call 721420527  | System.Linq.Enumerable.OrderBy(3 argument{s})
      1808621858142     ldsfld 67124133
      1808621858147        dup
      1808621858148   brtrue.s 1808621858173
      1808621858150        pop
      1808621858151     ldsfld 67124130
      1808621858156      ldftn
      1808621858162     newobj 167776080  | System.Func`2..ctor(2 argument{s})
      1808621858167        dup
      1808621858168     stsfld 67124133
      1808621858173       call 167772752  | System.StringComparer.get_OrdinalIgnoreCase()
      1808621858178       call 721420528  | System.Linq.Enumerable.ThenBy(3 argument{s})
      1808621858183    stloc.0
      1808621858184    ldloc.0
      1808621858185   callvirt 167776088  | System.Collections.Generic.IEnumerable`1.GetEnumerator()
      1808621858190    stloc.1
      .try {
        1808621858191       br.s 1808621858319
        1808621858193    ldloc.1
        1808621858194   callvirt 167776089  | System.Collections.Generic.IEnumerator`1.get_Current()
        1808621858199    stloc.2
        .try {
          1808621858200    ldarg.1
          1808621858201   callvirt 167776090  | System.Func`1.Invoke()
          1808621858206    stloc.3
          .try {
            1808621858207    ldloc.2
            1808621858208     ldnull
            1808621858209     ldnull
            1808621858210   callvirt 167772492  | System.Reflection.MethodBase.Invoke(2 argument{s})
            1808621858215        pop
            1808621858216    leave.s 1808621858228
          }
          .finally {
            1808621858218    ldloc.3
            1808621858219  brfalse.s 1808621858227
            1808621858221    ldloc.3
            1808621858222   callvirt 167772215  | System.IDisposable.Dispose()
            1808621858227 endfinally
          }
          1808621858228    leave.s 1808621858319
        }
        .catch {
          1808621858230    stloc.s 4
          1808621858232    ldloc.s 4
          1808621858234   callvirt 167772660  | System.Exception.get_InnerException()
          1808621858239   brtrue.s 1808621858248
          1808621858241     ldsfld 167772517
          1808621858246       br.s 1808621858260
          1808621858248    ldloc.s 4
          1808621858250   callvirt 167772660  | System.Exception.get_InnerException()
          1808621858255   callvirt 167772651  | System.Exception.get_Message()
          1808621858260    stloc.s 5
          1808621858262      ldstr 1879353730  | "Pre_application_start_init_method_threw_exception"
          1808621858267   ldc.i4.3
          1808621858268     newarr 16777252  | System.Object
          1808621858273        dup
          1808621858274   ldc.i4.0
          1808621858275    ldloc.2
          1808621858276   callvirt 167772681  | System.Reflection.MemberInfo.get_Name()
          1808621858281 stelem.ref
          1808621858282        dup
          1808621858283   ldc.i4.1
          1808621858284    ldloc.2
          1808621858285   callvirt 167772487  | System.Reflection.MemberInfo.get_DeclaringType()
          1808621858290   callvirt 167772443  | System.Type.get_FullName()
          1808621858295 stelem.ref
          1808621858296        dup
          1808621858297   ldc.i4.2
          1808621858298    ldloc.s 5
          1808621858300 stelem.ref
          1808621858301       call 100667721  | System.Web.SR.GetString(2 argument{s})
          1808621858306    ldloc.s 4
          1808621858308   callvirt 167772660  | System.Exception.get_InnerException()
          1808621858313     newobj 167774934  | System.InvalidOperationException..ctor(2 argument{s})
          1808621858318      throw
        }
        1808621858319    ldloc.1
        1808621858320   callvirt 167772264  | System.Collections.IEnumerator.MoveNext()
        1808621858325     brtrue 1808621858193
        1808621858330    leave.s 1808621858342
      }
      .finally {
        1808621858332    ldloc.1
        1808621858333  brfalse.s 1808621858341
        1808621858335    ldloc.1
        1808621858336   callvirt 167772215  | System.IDisposable.Dispose()
        1808621858341 endfinally
      }
      1808621858342    leave.s 1808621858370
    }
    .catch {
      1808621858344    stloc.s 6
      1808621858346    rethrow
    }
  }
  .finally {
    .try {
      1808621858348     ldnull
      1808621858349    ldloc.s 6
      1808621858351    ldloc.s 8
      1808621858353       call 721420602  | Elastic.Apm.Profiler.Managed.CallTarget.CallTargetInvoker.EndMethod(3 argument{s})
      1808621858358    stloc.s 7
      1808621858360    leave.s 1808621858369
    }
    .catch {
      1808621858362       call 721420601  | Elastic.Apm.Profiler.Managed.CallTarget.CallTargetInvoker.LogException(1 argument{s})
      1808621858367    leave.s 1808621858369
    }
    1808621858369 endfinally
  }
  1808621858370        ret

[2022-02-04T07:47:29.388057100+01:00] [INFO ] calltarget_rewriter_callback: finished System.Web.Compilation.BuildManager.InvokePreStartInitMethodsCore() [is_void=true, is_static=true, integration_type=Elastic.Apm.Profiler.Managed.Integrations.AspNet.ElasticApmModuleIntegration, arguments=2]

That's not an issue; it's an IL ldstr opcode that pushes a new object reference to a string literal stored in module metadata onto the stack. It's part of the IL instructions of the original System.Web.Compilation.BuildManager.InvokePreStartInitMethodsCore method,

specifically the constant SR.Pre_application_start_init_method_threw_exception on line 907. This method is rewritten for the ASP.NET integration. I wouldn't worry about logging the IL instructions of rewritten methods with the ELASTIC_APM_PROFILER_LOG_IL="1" environment variable unless you see an InvalidProgramException (i.e. the rewritten IL is incorrect in some way); it just adds noise to the logs.

The <appSettings> section shown looks fine, but neither the <modules> section nor the HttpApplication methods like Application_Start() are shown, so I can't advise whether there is something there that would instantiate the agent. Are you able to share those?

Hi Russ,

We are getting that information.

In the meantime we tried the following:

  1. Removed the profiler registration with appcmd
  2. Restarted IIS
  3. Started the application which has the web.config trace log configuration for the agent.
  4. After starting the application, we saw that neither an agent log nor trace logs were being generated.

This led me to believe that the only instantiation is being done by the profiler. Or is that a wrong conclusion?

What you're saying seems to imply that. If you reconfigure the profiler and restart IIS, do you get any traces? If not, are you able to attach a debugger and inspect the properties of the instantiated agent?

After reconfiguring the profiler and restarting, we get only one singleton message in the agent log. Trace logs are present, but we still see nothing besides metrics in the APM server.

We will install Visual Studio on the server and attach a debugger; hopefully more information soon.

We are having some trouble debugging; if it's possible to set up a meeting to have a look at it with us, that would be great.

What we did figure out is that in our case IIS has one website with every application as a virtual directory. We noticed that each initialization of an app triggers the singleton message. Is that by design, or part of the problem?

@forloop We found the following issue while debugging the agent:

I couldn't find anything about administrative rights for the installation, other than that the app pool identity needs execute rights on the profiler folder.

Potentially; I don't think this is a scenario that has been thoroughly tested so far.

The best way to determine whether having multiple applications in virtual directories under one website might be affecting things would be to remove all virtual directories except one, with the remaining one configured to run with the profiler. If that works as expected, add another app in a virtual directory and see if the issue arises.
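
If it helps with keeping track of them, appcmd can list the applications (including virtual-directory apps) assigned to a pool. A small sketch, assuming the OutSystemsApplications pool name from your logs:

$appcmd = "$($env:systemroot)\system32\inetsrv\AppCmd.exe"
# list all IIS applications (including virtual-directory apps) assigned to the pool
& $appcmd list app /apppool.name:"OutSystemsApplications"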

ETW is used by the APM agent on Windows for .NET Framework to collect GC metrics.

If there's an issue in starting a trace event session, the exception is caught, and GC metrics won't be reported.

It looks like the UnauthorizedAccessException is caught? Is the exception coming from the agent when attempting to collect GC metrics? It's not super clear from the screenshot if it's coming from the agent or something else that might be using ETW.

We did quite a lot of testing and I can share the following; hopefully it rings a bell:

  1. An application pool with one application worked like a charm. We immediately received transactions.
  2. We created a new application pool and moved one of the applications from the shared pool to it. It fails to send transactions, just as with the shared pool.

We inspected the differences between the already isolated application in its own pool and the one we moved to an isolated pool.

To give a bit more context, the application that does produce transactions is a platform (low-code) application built by the supplier. It is a standalone application with its own binaries.

Our applications are built/generated by the low-code platform, which creates a virtual directory per module that we created. These modules do not have isolated binaries; instead they have symbolic links to a platform repository where all the application DLLs are stored. This is how the platform handles references between modules etc.; nothing we can do about it.

Both applications are the same .NET/ASP.NET code with the same settings in IIS; the only difference is that applications built by the platform use symbolic links.

Getting APM to work for the applications our customers built is crucial for us. Therefore I wouldn't mind a collaboration with the Elastic team to see if we can figure out why this scenario doesn't work. I know it can work since a big competitor already proved that. Would this be possible? I have the environment all ready for it.

I believe this came from the agent but can't say with certainty; it was quite a challenge to get the debugger going. Placing the app pool identity in the Administrators group made the messages go away. It's not the solution, but I wanted to know whether I was missing rights or it was related to something else. Once we get the agent working on our application I'll focus on finding the missing rights. Placing the app pool identity in the user log/monitor groups wasn't enough.

@forloop We believe that this is why no traces are being collected.

So far it only occurs for the isolated application with symbolic links to the DLLs:

[2022-02-23 10:06:28.496 +01:00][Trace] - {SingleThreadTaskScheduler (thread: ElasticApmCentralConfigurationFetcher)} Finished executing task. Task: {ID: 46, Status: RanToCompletion, AsyncState: null}
[2022-02-23 10:06:28.630 +01:00][Warning] - {GcMetricsProvider} TraceEventSession initialization failed - GC metrics won't be collected+-> Exception: System.Runtime.InteropServices.COMException: Insufficient system resources exist to complete the requested service. (Exception from HRESULT: 0x800705AA)
   at System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32 errorCode, IntPtr errorInfo)
   at Microsoft.Diagnostics.Tracing.Session.TraceEventSession.InsureStarted(EVENT_TRACE_PROPERTIES* properties)
   at Microsoft.Diagnostics.Tracing.Session.TraceEventSession.get_Source()
   at Elastic.Apm.Metrics.MetricsProvider.GcMetricsProvider..ctor(IApmLogger logger, IReadOnlyList`1 disabledMetrics) in C:\Users\jenkins\workspace\net_apm-agent-dotnet-mbp_PR-1594\apm-agent-dotnet\src\Elastic.Apm\Metrics\MetricsProvider\GcMetricsProvider.cs:line 91

[2022-02-23 10:06:28.631 +01:00][Trace] - {CgroupMetricsProvider} CgroupMetricsProvider detected a non Linux OS, therefore Cgroup metrics will not be reported
[2022-02-23 10:06:28.632 +01:00][Info] - {MetricsCollector} Collecting metrics in 30000 milliseconds interval
[2022-02-23 10:06:28.647 +01:00][Debug] - {ElasticApmModule.#1} Found ASP.NET version: 4.8.4465.0
[2022-02-23 10:06:28.649 +01:00][Error] - The singleton APM agent has already been instantiated and can no longer be configured. Reusing existing instance
[2022-02-23 10:06:28.649 +01:00][Trace] - Initialization - Agent.Setup called. Callstack:    at Elastic.Apm.Agent.Setup(AgentComponents agentComponents)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.<>c__DisplayClass25_0.<InitOnceForAllInstancesUnderLock>b__0()
   at Elastic.Apm.Helpers.LazyContextualInit.<>c__DisplayClass6_0.<Init>b__0()
   at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func`1 valueFactory)
   at Elastic.Apm.Helpers.LazyContextualInit.Init(Action initAction)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.InitOnceForAllInstancesUnderLock(String dbgInstanceName)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.InitImpl(HttpApplication application)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.Init(HttpApplication application)
   at System.Web.HttpApplication.RegisterEventSubscriptionsWithIIS(IntPtr appContext, HttpContext context, MethodInfo[] handlers)
   at System.Web.HttpApplication.InitSpecial(HttpApplicationState state, MethodInfo[] handlers, IntPtr appContext, HttpContext context)
   at System.Web.HttpApplicationFactory.GetSpecialApplicationInstance(IntPtr appContext, HttpContext context)
   at System.Web.Hosting.PipelineRuntime.InitializeApplication(IntPtr appContext)

It breaks on .NeedLoadedDotNetRuntimes();

We ran out of event listeners (somehow). We manually removed them and the resource issue has not come back. We will try to find out what caused this, but it does not seem to be the main issue.

The failure to set up an ETW trace session should not be an issue for agent startup, but it does mean you wouldn't get GC metrics. You can disable GC metrics with

<add key="ElasticApm:DisableMetrics" value="clr.gc.*" />

which won't attempt to create an ETW trace session.
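
If you prefer to keep the agent configuration alongside the profiler settings in the app pool, a sketch of the environment-variable form of the same option (assuming ELASTIC_APM_DISABLE_METRICS is the variable name for this setting) would be:

# environment-variable form of ElasticApm:DisableMetrics, added to the $environment
# hashtable of the profiler install script before the appcmd loop runs
$environment["ELASTIC_APM_DISABLE_METRICS"] = "clr.gc.*"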

From the APM agent log posted, it looks like there is still an attempt to instantiate the singleton instance of the agent when it has already been instantiated:

[2022-02-23 10:06:28.649 +01:00][Error] - The singleton APM agent has already been instantiated and can no longer be configured. Reusing existing instance
[2022-02-23 10:06:28.649 +01:00][Trace] - Initialization - Agent.Setup called. Callstack:    at Elastic.Apm.Agent.Setup(AgentComponents agentComponents)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.<>c__DisplayClass25_0.<InitOnceForAllInstancesUnderLock>b__0()
   at Elastic.Apm.Helpers.LazyContextualInit.<>c__DisplayClass6_0.<Init>b__0()
   at System.Threading.LazyInitializer.EnsureInitializedCore[T](T& target, Boolean& initialized, Object& syncLock, Func`1 valueFactory)
   at Elastic.Apm.Helpers.LazyContextualInit.Init(Action initAction)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.InitOnceForAllInstancesUnderLock(String dbgInstanceName)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.InitImpl(HttpApplication application)
   at Elastic.Apm.AspNetFullFramework.ElasticApmModule.Init(HttpApplication application)
   at System.Web.HttpApplication.RegisterEventSubscriptionsWithIIS(IntPtr appContext, HttpContext context, MethodInfo[] handlers)
   at System.Web.HttpApplication.InitSpecial(HttpApplicationState state, MethodInfo[] handlers, IntPtr appContext, HttpContext context)
   at System.Web.HttpApplicationFactory.GetSpecialApplicationInstance(IntPtr appContext, HttpContext context)
   at System.Web.Hosting.PipelineRuntime.InitializeApplication(IntPtr appContext)

I'm fairly sure this is the crux of why we're not seeing traces.

Something worth trying is to disable the ASP.NET integration in the profiler and add the ElasticApmModule manually in web.config, referencing the Elastic.Apm.AspNetFullFramework 1.14.0 package from the NuGet Gallery. The integration can be disabled with the environment variable

ELASTIC_APM_PROFILER_EXCLUDE_INTEGRATIONS=AspNet

added to the app pool. You'll need to recycle the app pool for changes to take effect.

@forloop We managed to figure out why we receive no traces. You are right that it is related to the agent already being instantiated, but not in the way we were expecting. Here's what we found out:

  1. The thread exceptions are due to app pool recycles (consistently reproduced), so those are not related to our issue.
  2. Traces are being captured. While debugging we saw that traces are captured and logs are written. However, we couldn't find any evidence of them afterwards: no logs, no data received, even though the profiler and agent are active.
  3. We finally managed to catch the very first initialization of the agent. The first init is done with an empty ServerUrl, which makes it fall back to localhost:8200. There is then a second attempt to start the agent, where we see our ServerUrl being passed to the agent init, but by then it's too late because the singleton is already active. That's why we see no logs and receive no data.

So far we haven't figured out why the agent starts with no configuration for the first DLL from our app that instantiates it. Any ideas about this behaviour?

The weird thing is that we do get metrics, so how is that possible, given that the agent was instantiated with localhost instead of our Elastic URL?

@forloop Update: we figured out where the first agent init comes from. It's the profiler, which does not use the web.config. I added the agent environment variables to the profiler PowerShell setup script and am now seeing the new service appear in the APM server, along with traces.
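
In practice that meant adding the agent settings to the $environment hashtable the install script already pushes to the app pool via appcmd, roughly as in the sketch below (values masked, mirroring our web.config):

# agent settings added to the $environment hashtable of the profiler install script,
# before the appcmd loop runs, so the profiler-created agent gets the right server URL
$environment["ELASTIC_APM_SERVER_URL"] = "https://********.apm.westeurope.azure.elastic-cloud.com"
$environment["ELASTIC_APM_SECRET_TOKEN"] = "********"
$environment["ELASTIC_APM_SERVICE_NAME"] = "HyperionDevAPM"
$environment["ELASTIC_APM_ENVIRONMENT"] = "Hyperion Development"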