HTTP spans doesn't show the full information about the request

Hi!
I'm using Opentelemetry + Elastic.
The problem that I'm facing is that the http requests child spans doesn't show the full information. The way those spans are shown aren't useful since doesn't indicates which span fails of with what error.


The strange thing is that if the transaction is not a child span (i.e root span) the http/url information is displayed.

Kibana version: 7.15.0

Elasticsearch version: 7.15.0

APM Server version: 7.15.0

APM Agent language and version: Golang 1.15, Opentelemetry-Go 1.1.0, Otel-Collector-Contrib 0.38.0

Browser version: Chrome 95.0.4638.54

Original install method (e.g. download page, yum, deb, from source, etc.) and version: dmg

Hi @floyola,

First of all, welcome to the Elastic community :partying_face:

I noticed in your setup; you're using the contrib version of the OTel collector. I strongly recommend changing your code to use an OTLP exporter to send data directly to Elastic APM. Since the 7.13 release, the Elastic exporter has been deprecated, and users are encouraged to send data directly to Elastic APM. More information here.

If you need an example of how to send data directly to Elastic APM using OTLP, here is one:

:arrow_right: Check the run-without-collector.yaml example.

Other than this, mind that root spans usually will be richer than child spans because they may have been created by instrumented middleware. Child spans are generally the result of custom spans created throughout your code, to which you need to properly decorate them with the correct information — whether using labels and attributes. Errors also need to be explicitly recorded using the function RecordError(err error, options ...EventOption) from the Span struct.

@riferrei

1 Like

Hi @riferrei

I'm already using the otlp exporter :sweat_smile:

exporters:
  otlp/elastic:
    endpoint: $${ELASTIC_APM_ENDPOINT}
    headers:
      Authorization: "Bearer $${ELASTIC_APM_TOKEN}"

This gist shows how I'm initializing the tracer exporter to send the data to the OTLP exporter

And I'm using the otelhttp transport to instrument the requests

client = &http.Client{
  Transport: otelhttp.NewTransport(http.DefaultTransport),
}

The weird thing is that if I don't pass the proper context then the span are created as root span and they have the complete information, but once I pass a context with the trace information to link the spans, then those spans are child spans but doesn't show the request labels.
Since the instrumentation is the same I believe that is a matter of how the APM server is displaying the data according to the span type

Even better :hugs:

I have noticed in your gist that you instantiate an OTel exporter using the HTTP implementation:

func initTracer(ctx context.Context, conf tracingConfig) error {
	var err error
	traceExporter, err = otlptracehttp.New(ctx,
		otlptracehttp.WithTLSClientConfig(
			&tls.Config{InsecureSkipVerify: true},
		),
		otlptracehttp.WithEndpoint(conf.CollectorAddress),
		otlptracehttp.WithCompression(otlptracehttp.GzipCompression),
	)
	if err != nil {
		return err
	}

I'm curious if you configured this as well in your collector configuration file? Presumably, your receiver needs to be HTTP-based to work with your current version of the code. If this is the case, could you try switching to gRPC instead?

From:

receivers:
  otlp:
    protocols:
      http:

To:

receivers:
  otlp:
    protocols:
      grpc:

PS: You would need to change your code as well :weary:

I suspect that the translation from HTTP to gRPC can be causing problems. The OTLP exporter, as configured in your previous response, is sending data over gRPC:

exporters:
  otlp/elastic:
    endpoint: $${ELASTIC_APM_ENDPOINT}
    headers:
      Authorization: "Bearer $${ELASTIC_APM_TOKEN}"

It is a ballpark test, but I wouldn't be surprised if that is the case. The collector code keeps changing very fast, and things like this are every day in the OpenTelemetry world until things become more stable.

@riferrei

I can't use gRPC due to a problem with Bazel and protobuf :frowning_face:
That's why I'm using HTTP.

So do you think that somewhere in the translation, depending of the span type some attributes could be removed? :thinking:

Btw I'm also having issues with the gRPC labels, so yeah, it could be possible.
I'll try to fix the issue with the gRPC OTLP exporter and we'll see :crossed_fingers:

Not necessarily regarding span types, but protocol translation. When the collector receives data in HTTP to eventually send it over via gRPC — it needs to parse de HTTP data, create a payload in gRPC using the compatible messages, and then transfer the data. It seems simple, but it can lose some data because it involves marshaling and unmarshalling data types. Not necessarily something that is supposed to fail, but we know that exhaustive integration tests aren't always done.

Let us know the outcome of this investigation.

Hello @riferrei

I was finally able to use gRPC to send the data to my cluster.
Unfortunately nothing change.

If the span is root the information is shown properly, but if the span is a child it doesn't :sob:

@floyola would you be able to configure opentelemetry-collector with the logging exporter, send a trace with both types of spans, and paste the logging output here? That would help us work through what might be going wrong.

This is the child span

Span #1
    Trace ID       : 7396b89fffbe8f29e1d94495fd705d1c
    Parent ID      : c9225e3759e7cf49
    ID             : f7e76e6547fcac57
    Name           : HTTP POST
    Kind           : SPAN_KIND_CLIENT
    Start time     : 2021-11-25 15:23:42.039164 +0000 UTC
    End time       : 2021-11-25 15:23:42.413244333 +0000 UTC
    Status code    : STATUS_CODE_UNSET
    Status message :
Attributes:
     -> http.method: STRING(POST)
     -> http.url: STRING(https://my.url.com/)
     -> http.request_content_length: INT(84)
     -> http.scheme: STRING(http)
     -> http.host: STRING(my.url.com)
     -> http.flavor: STRING(1.1)
     -> http.status_code: INT(200)

As root span

Span #0
    Trace ID       : ab410cb96d391ea2a5814843006e1294
    Parent ID      :
    ID             : 60277cf373e45876
    Name           : HTTP POST
    Kind           : SPAN_KIND_CLIENT
    Start time     : 2021-11-25 15:26:28.049663 +0000 UTC
    End time       : 2021-11-25 15:26:28.411667708 +0000 UTC
    Status code    : STATUS_CODE_UNSET
    Status message :
Attributes:
     -> http.method: STRING(POST)
     -> http.url: STRING(https://my.url.com/)
     -> http.request_content_length: INT(102)
     -> http.scheme: STRING(http)
     -> http.host: STRING(my.url.com)
     -> http.flavor: STRING(1.1)
     -> http.status_code: INT(200)

Thanks @floyola.

I cannot reproduce this :frowning:
I modified one of the examples in opentelemetry-go to send two spans, similar to what's in your your logs (see below). I pointed that at opentelemetry-collector-contrib, and pointed that at apm-server 7.15.0. I can see the URL and HTTP attributes in the span details:

Can you provide the source JSON for one of the child span documents in Elasticsearch?

// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Example using OTLP exporters + collector + third-party backends. For
// information about using the exporter, see:
// https://pkg.go.dev/go.opentelemetry.io/otel/exporters/otlp?tab=doc#example-package-Insecure
package main

import (
	"context"
	"log"

	"google.golang.org/grpc"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.7.0"
	"go.opentelemetry.io/otel/trace"
)

// Initializes an OTLP exporter, and configures the corresponding trace and
// metric providers.
func initProvider() func() {
	ctx := context.Background()

	res, err := resource.New(ctx,
		resource.WithAttributes(
			// the service name used to display traces in backends
			semconv.ServiceNameKey.String("test-service"),
		),
	)
	handleErr(err, "failed to create resource")

	// Set up a trace exporter
	traceExporter, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithInsecure(),
		otlptracegrpc.WithEndpoint("localhost:55680"),
		otlptracegrpc.WithDialOption(grpc.WithBlock()),
	)
	handleErr(err, "failed to create trace exporter")

	// Register the trace exporter with a TracerProvider, using a batch
	// span processor to aggregate spans before export.
	bsp := sdktrace.NewBatchSpanProcessor(traceExporter)
	tracerProvider := sdktrace.NewTracerProvider(
		sdktrace.WithSampler(sdktrace.AlwaysSample()),
		sdktrace.WithResource(res),
		sdktrace.WithSpanProcessor(bsp),
	)
	otel.SetTracerProvider(tracerProvider)

	// set global propagator to tracecontext (the default is no-op).
	otel.SetTextMapPropagator(propagation.TraceContext{})

	return func() {
		// Shutdown will flush any remaining spans and shut down the exporter.
		handleErr(tracerProvider.Shutdown(ctx), "failed to shutdown TracerProvider")
	}
}

func main() {
	log.Printf("Waiting for connection...")

	shutdown := initProvider()
	defer shutdown()

	tracer := otel.Tracer("test-tracer")

	attrs := []attribute.KeyValue{
		semconv.HTTPMethodKey.String("POST"),
		semconv.HTTPURLKey.String("https://my.url.com/"),
		semconv.HTTPSchemeHTTP,
		semconv.HTTPHostKey.String("my.url.com"),
		semconv.HTTPFlavorHTTP11,
		semconv.HTTPStatusCodeKey.Int(200),
	}

	ctx, rootSpan := tracer.Start(
		context.Background(), "root",
		trace.WithSpanKind(trace.SpanKindClient),
		trace.WithAttributes(attrs...),
	)
	ctx, childSpan := tracer.Start(
		ctx, "child",
		trace.WithSpanKind(trace.SpanKindClient),
		trace.WithAttributes(attrs...),
	)
	childSpan.End()
	rootSpan.End()

	log.Printf("Done!")
}

func handleErr(err error, message string) {
	if err != nil {
		log.Fatalf("%s: %v", message, err)
	}
}

Hi @axw

Here it is

{
  "_index": "apm-7.15.0-span-000001",
  "_type": "_doc",
  "_id": "xcA9bH0B8rQ47_3ObHMi",
  "_version": 1,
  "_score": 1,
  "_source": {
    "parent": {
      "id": "63f3e2e7c2c62742"
    },
    "agent": {
      "name": "otlp",
      "version": "unknown"
    },
    "destination": {
      "address": "my.url.com",
      "port": 443
    },
    "processor": {
      "name": "transaction",
      "event": "span"
    },
    "url": {
      "original": "https://my.url.com/"
    },
    "labels": {
      "http_flavor": "1.1"
    },
    "observer": {
      "hostname": "ee32bebdb23e",
      "id": "ef9e88a6-7ae1-46e2-bfc3-70ef63b00606",
      "ephemeral_id": "a03808d2-83c3-4d5c-8ba4-75f0ca92d34c",
      "type": "apm-server",
      "version": "7.15.0",
      "version_major": 7
    },
    "trace": {
      "id": "8a78ca00f99ecc85d292f5c6fee5a6b2"
    },
    "@timestamp": "2021-11-29T15:08:22.444Z",
    "ecs": {
      "version": "1.11.0"
    },
    "service": {
      "name": "test-service"
    },
    "http": {
      "request": {
        "method": "POST"
      },
      "response": {
        "status_code": 200
      }
    },
    "event": {
      "outcome": "success"
    },
    "timestamp": {
      "us": 1638198502444966
    },
    "span": {
      "duration": {
        "us": 2
      },
      "subtype": "http",
      "destination": {
        "service": {
          "resource": "my.url.com:443",
          "name": "https://my.url.com",
          "type": "external"
        }
      },
      "name": "child",
      "http": {
        "method": "POST",
        "response": {
          "status_code": 200
        }
      },
      "http.url.original": "https://my.url.com/",
      "id": "84a6b6f38aa01f89",
      "type": "external"
    }
  },
  "fields": {
    "span.destination.service.type": [
      "external"
    ],
    "span.name": [
      "child"
    ],
    "span.http.method": [
      "POST"
    ],
    "destination.port": [
      443
    ],
    "span.http.response.status_code": [
      200
    ],
    "url.original.text": [
      "https://my.url.com/"
    ],
    "trace.id": [
      "8a78ca00f99ecc85d292f5c6fee5a6b2"
    ],
    "span.duration.us": [
      2
    ],
    "http.request.method": [
      "POST"
    ],
    "processor.event": [
      "span"
    ],
    "agent.name": [
      "otlp"
    ],
    "destination.address": [
      "my.url.com"
    ],
    "http.response.status_code": [
      200
    ],
    "event.outcome": [
      "success"
    ],
    "service.name": [
      "test-service"
    ],
    "processor.name": [
      "transaction"
    ],
    "span.subtype": [
      "http"
    ],
    "span.id": [
      "84a6b6f38aa01f89"
    ],
    "observer.version_major": [
      7
    ],
    "span.http.url.original": [
      "https://my.url.com/"
    ],
    "observer.hostname": [
      "ee32bebdb23e"
    ],
    "span.type": [
      "external"
    ],
    "observer.id": [
      "ef9e88a6-7ae1-46e2-bfc3-70ef63b00606"
    ],
    "timestamp.us": [
      1638198502444966
    ],
    "url.original": [
      "https://my.url.com/"
    ],
    "@timestamp": [
      "2021-11-29T15:08:22.444Z"
    ],
    "observer.ephemeral_id": [
      "a03808d2-83c3-4d5c-8ba4-75f0ca92d34c"
    ],
    "observer.version": [
      "7.15.0"
    ],
    "ecs.version": [
      "1.11.0"
    ],
    "observer.type": [
      "apm-server"
    ],
    "parent.id": [
      "63f3e2e7c2c62742"
    ],
    "labels.http_flavor": [
      "1.1"
    ],
    "agent.version": [
      "unknown"
    ],
    "span.destination.service.name": [
      "https://my.url.com"
    ],
    "span.destination.service.resource": [
      "my.url.com:443"
    ]
  }
}

I ran the snippet you paste above over a new APM server (using docker) and the result was the same :sob:

Root:

Child:

Thanks! The doc looks good. It occurs to me now that when I tested, I tested with APM Server 7.15.0 but a more recent version of Kibana. I went back and tested with Kibana 7.15.0, and I see what you're seeing now.

The good news is that with newer versions of Kibana you should see all of the details. This issue was addressed by [APM] Display all properties by default in flyout by dgieselaar · Pull Request #113221 · elastic/kibana · GitHub, which will go into 7.16.0.

1 Like

Awesome!
I'll wait then.

Thank you so much @axw @riferrei

2 Likes