tselofan
(Алексей Целиковский)
August 20, 2024, 7:08am
1
We use prometheus-net.DotNetMetrics for system monitoring in an ASP.NET Core (.NET 6) application. After we started using the Elastic APM agent for .NET, the dotnet_exceptions_total metric began to show increased values for System.NotSupportedException exceptions. We tried agent versions v1.18.0 and v1.28.4.
dotnet-trace shows that the exceptions originate in Dataflow (Task Parallel Library). The exceptions have the message:
This member is not supported on this dataflow block. The block is
intended for a specific purpose that does not utilize this member.
These exceptions are caught inside Dataflow, but they create noise in the metric and can affect performance.
Source code where the exception is caught:
    Debug.Assert(cancellationToken.IsCancellationRequested,
        "The task will only be immediately canceled if the token has cancellation requested already.");
    var t = new Task<TResult>(CachedGenericDelegates<TResult>.DefaultTResultFunc, cancellationToken);
    Debug.Assert(t.IsCanceled, "Task's constructor should cancel the task synchronously in the ctor.");
    return t;
}

/// <summary>Gets the completion task of a block, and protects against common cases of the completion task not being implemented or supported.</summary>
/// <param name="block">The block.</param>
/// <returns>The completion task, or null if the block's completion task is not implemented or supported.</returns>
internal static Task? GetPotentiallyNotSupportedCompletionTask(IDataflowBlock block)
{
    Debug.Assert(block != null, "We need a block from which to retrieve a cancellation task.");
    try
    {
        return block.Completion;
    }
    catch (NotImplementedException) { }
    catch (NotSupportedException) { }
    return null;
}
We see this problem in Kubernetes pods, but not on a local machine running Windows.
Has anyone encountered this problem?
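To illustrate why an exception that is caught and swallowed can still inflate an exception counter, here is a minimal sketch. It uses AppDomain.CurrentDomain.FirstChanceException as a stand-in for whatever source the metric actually reads (that is an assumption, the thread does not show how dotnet_exceptions_total is collected). FirstChanceDemo, CountFirstChance and ThrowAndSwallow are hypothetical names made up for this example.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading;

public static class FirstChanceDemo
{
    // Runs the action and returns how many NotSupportedException instances
    // were thrown (first-chance) in the process while it ran, caught or not.
    public static int CountFirstChance(Action action)
    {
        int count = 0;
        EventHandler<FirstChanceExceptionEventArgs> handler = (_, e) =>
        {
            if (e.Exception is NotSupportedException)
            {
                Interlocked.Increment(ref count);
            }
        };

        AppDomain.CurrentDomain.FirstChanceException += handler;
        try
        {
            action();
        }
        finally
        {
            AppDomain.CurrentDomain.FirstChanceException -= handler;
        }
        return count;
    }

    // Same shape as GetPotentiallyNotSupportedCompletionTask: throw, catch, carry on.
    public static void ThrowAndSwallow()
    {
        try
        {
            throw new NotSupportedException("This member is not supported on this dataflow block.");
        }
        catch (NotSupportedException)
        {
            // Swallowed on purpose; the caller never sees the exception.
        }
    }
}
```

Calling FirstChanceDemo.CountFirstChance(FirstChanceDemo.ThrowAndSwallow) reports the throw even though nothing escapes ThrowAndSwallow, which matches the kind of noise described above.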
Thanks for bringing this to our attention, @tselofan.
We only use a plain BatchBlock, which definitely implements block.Completion.
    return _source.TryReceive(filter, out item);
}
/// <include file='XmlDocs/CommonXmlDocComments.xml' path='CommonXmlDocComments/Sources/Member[@name="TryReceiveAll"]/*' />
public bool TryReceiveAll([NotNullWhen(true)] out IList<T[]>? items) { return _source.TryReceiveAll(out items); }
/// <include file='XmlDocs/CommonXmlDocComments.xml' path='CommonXmlDocComments/Sources/Member[@name="OutputCount"]/*' />
public int OutputCount { get { return _source.OutputCount; } }
/// <include file='XmlDocs/CommonXmlDocComments.xml' path='CommonXmlDocComments/Blocks/Member[@name="Completion"]/*' />
public Task Completion { get { return _source.Completion; } }
/// <summary>Gets the size of the batches generated by this <see cref="BatchBlock{T}"/>.</summary>
/// <remarks>
/// If the number of items provided to the block is not evenly divisible by the batch size provided
/// to the block's constructor, the block's final batch may contain fewer than the requested number of items.
/// </remarks>
public int BatchSize { get { return _target.BatchSize; } }
/// <include file='XmlDocs/CommonXmlDocComments.xml' path='CommonXmlDocComments/Targets/Member[@name="OfferMessage"]/*' />
DataflowMessageStatus ITargetBlock<T>.OfferMessage(DataflowMessageHeader messageHeader, T messageValue, ISourceBlock<T>? source, bool consumeToAccept)
I had a quick look at the differences in BatchBlock/SourceCore and the various BlockOption settings between .NET and .NET Core, but could not spot anything obvious.
I created a channel-based payload sender in this draft PR: Elastic.Apm.Ingest, new PayloadSender implementation by Mpdreamz · Pull Request #2171 · elastic/apm-agent-dotnet · GitHub
It might be worth finishing that exercise to move us off TPL in the long run.
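For readers unfamiliar with the idea, a channel-based replacement for BatchBlock could look roughly like the sketch below. This is not the code from the draft PR; ChannelBatcher and ReadBatchesAsync are hypothetical names, and the sketch only shows the batching step using System.Threading.Channels, with no timers, retries or HTTP.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelBatcher
{
    // Reads the channel until it is completed and groups the items into
    // arrays of at most batchSize elements. The last batch may be smaller.
    public static async Task<List<T[]>> ReadBatchesAsync<T>(ChannelReader<T> reader, int batchSize)
    {
        if (batchSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        }

        var batches = new List<T[]>();
        var current = new List<T>(batchSize);

        // WaitToReadAsync returns false once the writer has completed
        // and every buffered item has been read.
        while (await reader.WaitToReadAsync())
        {
            while (reader.TryRead(out T item))
            {
                current.Add(item);
                if (current.Count == batchSize)
                {
                    batches.Add(current.ToArray());
                    current.Clear();
                }
            }
        }

        if (current.Count > 0)
        {
            batches.Add(current.ToArray());
        }
        return batches;
    }
}
```

Because nothing here links dataflow blocks together, the Dataflow tracing path discussed in this thread would not be involved.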
tselofan
(Алексей Целиковский)
August 21, 2024, 4:31am
3
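Thanks for your answer! My research has shown the following:
The cause is targetBlock, which is of type ReceiveTarget.
The main problem is that the exception occurs when the DataflowEtwProvider event source is enabled. In that case Dataflow tries to collect tracing data when blocks are linked, presumably reading the target's Completion through GetPotentiallyNotSupportedCompletionTask from my first post, which ReceiveTarget does not support.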
    // Add the target to both stores, the list and the dictionary, which are used for different purposes
    var node = new LinkedTargetInfo(target, linkOptions);
    AddToList(node, linkOptions.Append);
    _targetInformation.Add(target, node);

    // Increment the optimization counter if needed
    Debug.Assert(_linksWithRemainingMessages >= 0, "_linksWithRemainingMessages must be non-negative at any time.");
    if (node.RemainingMessages > 0) _linksWithRemainingMessages++;

    DataflowEtwProvider etwLog = DataflowEtwProvider.Log;
    if (etwLog.IsEnabled())
    {
        etwLog.DataflowBlockLinking(_owningSource, target);
    }
}

/// <summary>Gets whether the registry contains a particular target.</summary>
/// <param name="target">The target.</param>
/// <returns>true if the registry contains the target; otherwise, false.</returns>
internal bool Contains(ITargetBlock<T> target)
{
I don't think this is the usual behaviour, so I need to disable it. First I need to find out which component enabled this EventSource.
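One way to start that search is to list which event sources are currently enabled in the process. The sketch below uses EventSource.GetSources() and IsEnabled() for that, and includes a catch-all listener to show how a single EventListener that enables everything flips IsEnabled() for unrelated sources. DemoSource, CatchAllListener and EventSourceProbe are hypothetical names for this example; the sketch does not tell you which listener did the enabling, only that something did.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using System.Linq;

// Hypothetical event source that stands in for any library-owned source.
[EventSource(Name = "Demo-Source")]
public sealed class DemoSource : EventSource
{
}

// A listener that enables every event source it learns about,
// existing ones as well as ones created later.
public sealed class CatchAllListener : EventListener
{
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        EnableEvents(eventSource, EventLevel.Informational, EventKeywords.All);
    }
}

public static class EventSourceProbe
{
    // Returns the names of all event sources that report IsEnabled() == true.
    public static IReadOnlyList<string> EnabledEventSourceNames()
    {
        return EventSource.GetSources()
            .Where(source => source.IsEnabled())
            .Select(source => source.Name)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

Logging EventSourceProbe.EnabledEventSourceNames() at startup, with and without a suspected library registered, is a cheap way to narrow down who is enabling the Dataflow event source.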
tselofan
(Алексей Целиковский)
August 22, 2024, 4:39am
4
The cause was the prometheus-net library. In version 7.0.0 it subscribed to all event sources in the app domain.
namespace Prometheus
{
    public sealed class EventCounterAdapterOptions
    {
        public static readonly EventCounterAdapterOptions Default = new();

        /// <summary>
        /// By default we subscribe to event counters from all event sources but this allows you to filter by event source name.
        /// </summary>
        public Func<string, bool> EventSourceFilterPredicate { get; set; } = _ => true;

        /// <summary>
        /// By default, we subscribe to event counters at Informational level from every event source.
        /// You can customize these settings via this callback (with the event source name as the string given as input).
        /// </summary>
        public Func<string, EventCounterAdapterEventSourceSettings> EventSourceSettingsProvider { get; set; } = _ => new();

        public CollectorRegistry Registry { get; set; } = Metrics.DefaultRegistry;
    }
}
I upgraded it to version 8 and the problem went away.
    /// without just enabling everything under the sky (because .NET has no way to say "enable only the event counters", you have to enable all diagnostic events).
    /// </summary>
    private static readonly IReadOnlyList<string> DefaultEventSourcePrefixes = new[]
    {
        "System.Runtime",
        "Microsoft-AspNetCore",
        "Microsoft.AspNetCore",
        "System.Net"
    };

    public static readonly Func<string, bool> DefaultEventSourceFilterPredicate = name => DefaultEventSourcePrefixes.Any(x => name.StartsWith(x, StringComparison.Ordinal));
}
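To see why the version 8 default leaves Dataflow alone, the prefix check above can be reproduced in isolation. The sketch below copies the four prefixes from the excerpt into a standalone class; EventSourceFilterSketch and IsAllowed are hypothetical names, and the Dataflow event source name used in the usage note is my assumption about what the runtime registers, so verify it against your own process.

```csharp
using System;
using System.Linq;

public static class EventSourceFilterSketch
{
    // Prefixes copied from the prometheus-net excerpt above.
    private static readonly string[] Prefixes =
    {
        "System.Runtime",
        "Microsoft-AspNetCore",
        "Microsoft.AspNetCore",
        "System.Net"
    };

    // True when the event source name starts with one of the allowed prefixes
    // (ordinal, case-sensitive comparison, as in the excerpt).
    public static bool IsAllowed(string eventSourceName)
    {
        if (eventSourceName == null)
        {
            throw new ArgumentNullException(nameof(eventSourceName));
        }
        return Prefixes.Any(prefix => eventSourceName.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```

With this predicate, IsAllowed("System.Runtime") passes while a name such as "System.Threading.Tasks.Dataflow.DataflowEventSource" does not, so the listener never enables it. If you have to stay on version 7, a predicate like this could presumably be assigned to the EventSourceFilterPredicate property shown earlier, although I have not verified how those options are wired up in that version.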
Thank you for your participation!