Only common trace ID for sampled spans

Hi,

We have a bunch of microservices implemented in Go that talk to each other over gRPC.

Recently we implemented logging of the trace ID in our services so that we can correlate logs across services. We use the Zap logger and extract the APM trace context to create a new logger, as described here.

However, it seems like the trace ID only gets sent from one service to the next if the current span is sampled. If the span isn't sampled we get a (seemingly) random trace ID and we're unable to trace the request across services.

We instrument our clients with the interceptor like this:

conn, err := grpc.Dial(addr, grpc.WithInsecure(), grpc.WithUnaryInterceptor(apmgrpc.NewUnaryClientInterceptor()))

Looking at the interceptor's source code we can see that the metadata doesn't get added if the span is dropped:

func startSpan(ctx context.Context, name string) (*apm.Span, context.Context) {
	span, ctx := apm.StartSpan(ctx, name, "external.grpc")
	if span.Dropped() {
		return span, ctx
	}
	traceparentValue := apmhttp.FormatTraceparentHeader(span.TraceContext())
	md, ok := metadata.FromOutgoingContext(ctx)
	if !ok {
		md = metadata.Pairs(traceparentHeader, traceparentValue)
	} else {
		md = md.Copy()
		md.Set(traceparentHeader, traceparentValue)
	}
	return span, metadata.NewOutgoingContext(ctx, md)
}

This means that with a sample rate below 1.0, distributed tracing doesn't work as intended: unsampled requests get a fresh trace ID at every service boundary instead of carrying one trace ID end to end.

Now the questions are:

  • are we doing anything wrong?
  • have we misunderstood how distributed tracing works?
  • is this a bug in the module?

We're using version 1.4.0 of the APM gRPC module.

Hi @Ricco_Forgaard,

Welcome to the forum, and thanks for raising your question here. You have indeed found a bug in the apmgrpc module. The client interceptor should behave like the apmhttp client instrumentation and propagate the transaction's trace context even when the span is dropped.

I have opened https://github.com/elastic/apm-agent-go/issues/601 to get this fixed.

Thanks @axw, good to know.

In case you aren't watching the issue: we've just released v1.5.0 of the Go agent which includes a fix for this. Thanks again for bringing it to our attention.

I did, thanks for responding so quickly! Highly appreciated.