Distributed tracing technologies allow developers to virtually glue together disparate services into a cohesive transaction that the operations team can observe. This is important because the distributed nature of modern cloud-native applications tends to drive up metrics like MTTD (Mean Time To Detect) and MTTR (Mean Time To Resolve) when issues happen.
Though tracing technologies are not necessarily a new concept, only in recent years have they gained enough traction to become one of the dimensions required to effectively monitor applications. Part of this traction is the result of standards such as OpenTracing and OpenCensus being created to speed up developer adoption. However, having competing standards creates fragmentation rather than standardization. For this reason, OpenTelemetry was created out of the existing standards to be an observability framework for cloud-native software.
In this post, I will walk you through how to instrument applications written in Go to emit traces compatible with the OpenTelemetry specification, as well as how to send these traces to Elastic APM.
Overview of the OpenTelemetry Architecture
Before getting into the nitty-gritty details of how to instrument applications, it is important to understand the architecture behind OpenTelemetry. The main building blocks are receivers, exporters, and collectors. Receivers allow a given layer of the architecture to accept data over a protocol; multiple protocols are available, but the most common one is OTLP. Exporters act as an API that encapsulates the details of how to send data to a given backend. Collectors combine a set of receivers and exporters into a processing pipeline—for example, preparing the data to be sent to one or multiple destinations. The diagram below depicts this architecture.
As you can see in the diagram, the instrumented application uses at least one of the building blocks, the exporter, because it needs to be able to send the generated traces to the collector. The collector sits between the instrumented application and the telemetry backend—which for the purposes of this post will be Elastic APM. The collector needs a receiver to accept the traces sent from the instrumented application. Similarly, the collector doesn't know how to send data to Elastic APM, so it uses an exporter that knows how to talk to Elastic APM and makes that technology transparent to the collector.
By focusing on receivers, exporters, and collectors, the OpenTelemetry architecture allows users to mix and match different protocols and technologies, which in turn gives them the flexibility to choose different vendors without sacrificing compatibility. This means you can write your application to emit traces compatible with OpenTelemetry while using one telemetry backend during development and switch to another (hopefully Elastic APM) on the way to production.
Implementing a Collector
The collector plays a key role in the architecture because it acts as the middleware that decouples the instrumented application and the telemetry backend. Though you can build your own collector based on the OpenTelemetry specification—you don’t have to. There is a reusable implementation of a collector written in Go that suits most use cases. There is also a contributions version that has code from multiple open source projects such as Elastic APM. In this post we will use the latter.
The collector is configured using YAML. Create a YAML file with the following content:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:55680
extensions:
  health_check:
exporters:
  elastic:
    apm_server_url: "http://apm-server:8200"
service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [elastic]
A receiver for the OTLP protocol is set, which expects to receive data on any network interface over port 55680. This is the endpoint where the instrumented application will send the traces. An extension is also set to provide a way for any upstream monitoring layer to run health checks against the collector. By default, health checks are served over HTTP on port 13133. Finally, the collector uses the built-in exporter for Elastic APM to send the data to the endpoint http://apm-server:8200, which is where Elastic APM should be exposed for this to work.
Once the configuration for the collector is in place, you can spin up a new collector instance and pass this configuration as a parameter. For example, let's say that you save it in a file named collector-config-local.yaml. You can then spin up the collector using Docker Compose like this:
collector:
  image: otel/opentelemetry-collector-contrib-dev
  container_name: collector
  hostname: collector
  command: ["--config=/etc/collector-config-local.yaml"]
  volumes:
    - ./collector-config-local.yaml:/etc/collector-config-local.yaml
  ports:
    - "13133:13133"
    - "55680:55680"
  networks:
    - any_shared_network
  depends_on:
    apm-server:
      condition: service_healthy
Note that the collector binds ports 13133 and 55680 to the host machine. Port 13133 can be used for health check purposes, and port 55680 should be used by the instrumented application to send the traces.
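As a quick sanity check, you can probe the health check extension from the host machine once the collector is up. The exact response body varies between collector versions, but a healthy collector should answer with an HTTP 200:

curl -i http://localhost:13133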
Instrumenting the Go application
Now that the collector is up-and-running we can instrument the application. The first step is to add the OpenTelemetry packages required for the Go agent:
go.opentelemetry.io/otel
go.opentelemetry.io/otel/exporters/otlp
go.opentelemetry.io/otel/sdk
This example creates an HTTP API written in Go using the gorilla/mux framework, so import the following packages as well:
github.com/gorilla/mux
go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/otelmux
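If you are using Go modules, these dependencies can be fetched with go get. The commands below are a sketch; depending on when you try this, you may need to pin the pre-1.0 (v0.13.x era) versions that match the API used throughout this post:

go get go.opentelemetry.io/otel
go get go.opentelemetry.io/otel/exporters/otlp
go get go.opentelemetry.io/otel/sdk
go get github.com/gorilla/mux
go get go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/otelmux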
Now let's suppose that the endpoint of the collector is passed as an environment variable named COLLECTOR_ADDRESS. The value of this variable needs to follow the pattern hostname:port. In this case, the first step is to instantiate an exporter so the application knows how to send data to the collector:
collectorAddress := os.Getenv("COLLECTOR_ADDRESS")
exporter, err := otlp.NewExporter(
    otlp.WithInsecure(),
    otlp.WithAddress(collectorAddress))
if err != nil {
    log.Fatalf("Error creating the exporter: %v", err)
}
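For example, if you run the application directly on your host machine against the collector started with the Docker Compose snippet above, the variable would point at the forwarded OTLP port:

export COLLECTOR_ADDRESS=localhost:55680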
To make communication between the application and the collector more efficient, we can batch the data before sending it out. To implement this behavior, instantiate a batch span processor:
bsp := sdktrace.NewBatchSpanProcessor(exporter)
defer bsp.Shutdown()
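The batch span processor also accepts tuning options that control how much data is buffered and how often it is flushed. The snippet below is only illustrative; these option names are commonly available, but they may differ slightly between SDK releases:

// Illustrative tuning options; names may vary across SDK releases.
bsp := sdktrace.NewBatchSpanProcessor(exporter,
    sdktrace.WithMaxExportBatchSize(512),     // flush once 512 spans are queued
    sdktrace.WithBatchTimeout(5*time.Second)) // or at most every 5 seconds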
Tracing is all about communication so you have to describe the service accordingly. For this reason you can use labels in your code to identify your service:
res := resource.New(
    semconv.ServiceNameKey.String("hello-app"),
    semconv.ServiceVersionKey.String("1.0"),
    semconv.TelemetrySDKNameKey.String("opentelemetry"),
    semconv.TelemetrySDKLanguageKey.String("go"),
    semconv.TelemetrySDKVersionKey.String("0.13.0"))
Finally, instantiate a tracer provider and make it available throughout the entire application:
tracerProvider := sdktrace.NewTracerProvider(
    sdktrace.WithSpanProcessor(bsp),
    sdktrace.WithResource(res))
global.SetTracerProvider(tracerProvider)
global.SetTextMapPropagator(otel.NewCompositeTextMapPropagator(
    propagators.TraceContext{}, propagators.Baggage{}))
Note that along with the tracer provider we also set a text map propagator. This is important because we want to automatically propagate the tracing data in the carrier layer of the application which in this case is HTTP.
The code written so far makes the tracer provider available to the application and is required only during application bootstrapping. After this you can execute the code that your application needs to perform its functions, such as registering which HTTP endpoints will be exposed to users.
router := mux.NewRouter()
router.Use(otelmux.Middleware("hello-app"))
router.HandleFunc("/hello", hello)
http.ListenAndServe(":8888", router)
Note that we wrapped the router with the OpenTelemetry middleware. Internally the middleware knows how to get the tracer provider and build its own tracers to create spans.
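With the application running, every request to the instrumented route produces a trace that flows from the exporter, through the collector, and into Elastic APM. For example:

curl http://localhost:8888/hello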
Creating Custom Spans
You don't have to rely only on the spans created by the instrumentation libraries. You can use the OpenTelemetry API in your code to create your own spans. The code below creates a span called custom-span that contains an attribute named custom-label whose value is set to Gopher.
func hello(writer http.ResponseWriter, request *http.Request) {
    ctx := request.Context()
    _, customSpan := tracer.Start(ctx, "custom-span",
        trace.WithAttributes(
            label.String("custom-label", "Gopher")))
    customSpan.End()
    response := Response{"Hello World"}
    bytes, _ := json.Marshal(response)
    writer.Header().Add("Content-Type", "application/json")
    writer.Write(bytes)
}
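The tracer variable used in this handler is not shown above; it is obtained from the tracer provider registered earlier. A minimal sketch, assuming the pre-1.0 global API used throughout this post:

// Obtain a named tracer from the globally registered tracer provider.
// In newer SDK releases the equivalent call is otel.Tracer("hello-app").
var tracer = global.Tracer("hello-app")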
Since this span is created in the function that implements an HTTP handler, a child span will be created every time this handler is executed.
How is this span ultimately associated with the HTTP request? Note that before creating the span we retrieved the context from the HTTP request. The same context was passed down to the tracer that creates the span. Behind the scenes, the span already active in that context becomes the parent of the new one, and that context was populated automatically from the HTTP headers by the middleware.
Transactions on Elastic APM
So far we have been writing all this code to send traces to Elastic APM, and now it is time to collect the dividends of this effort. If you go to APM in Kibana, you should see the following:
More importantly, if you drill down into the transactions tab and search for a transaction named /hello, you will see all the transactions executed thus far:
Clicking on the custom span will show the details of the span—which includes the label that we created programmatically.
Cool, right?
Trying it out yourself
If you want to try this code but don't have the time to write it all yourself, don't worry: you can find the whole code already implemented and ready to be executed in the following GitHub repository:
This repository provides two options for you:
- Execute with everything running locally on your machine
To use this option just execute Docker Compose this way:
docker-compose -f docker-compose-local.yaml up -d
- Execute with Elastic APM running on Elastic Cloud
To use this option, first edit the file collector-config-cloud.yaml and provide the Elastic APM endpoint and the secret token. Then you can execute Docker Compose this way:
docker-compose -f docker-compose-cloud.yaml up -d
Summary
This post provided a brief introduction to OpenTelemetry and why it is so important for modern cloud-native applications. It explained its architecture and how to instrument applications written in Go. I would highly encourage you to further explore OpenTelemetry in conjunction with Elastic APM, which provides stellar support for the other pillars of a true observability framework, such as logs and metrics.
Content like this can be found in the Elastic Community channel on YouTube.