Problem
Our company wants to manage traces using APM, but the flow for publishing traces is a bit different. Currently, we use OpenTelemetry to capture traces for our application, but we don't want to use a collector. Instead, we want to publish the trace to Kafka and then use Logstash to push the trace to the APM server using the OpenTelemetry intake API (v1/traces). To achieve this, I want to capture the OTLP payload. Below is the exporter I have developed to get the buffer:
public final class APMExporter implements SpanExporter {

    private final Logger logger = LoggerFactory.getLogger(APMExporter.class);

    public CompletableResultCode export(Collection<SpanData> spans) {
        Marshaler marshaler = null;
        if (memoryMode == MemoryMode.REUSABLE_DATA) {
            LowAllocationTraceRequestMarshaler lowAllocationMarshaler = marshalerPool.poll();
            if (lowAllocationMarshaler == null) {
                lowAllocationMarshaler = new LowAllocationTraceRequestMarshaler();
            }
            lowAllocationMarshaler.initialize(spans);
            marshaler = lowAllocationMarshaler;
        } else {
            marshaler = TraceRequestMarshaler.create(spans);
        }
        try {
            String bodyRequest = writeToBinary(marshaler);
            if (marshaler instanceof LowAllocationTraceRequestMarshaler) {
                ((LowAllocationTraceRequestMarshaler) marshaler).reset();
                marshalerPool.add((LowAllocationTraceRequestMarshaler) marshaler);
            }
            apmServer.publishTrace(bodyRequest);
            return CompletableResultCode.ofSuccess();
        } catch (IOException ignored) {
        }
        return CompletableResultCode.ofFailure();
    }

    public String writeToBinary(Marshaler marshaler) throws IOException {
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(marshaler.getBinarySerializedSize());
        marshaler.writeToBinary(outputStream);
        return outputStream.toString();
    }

    public String toString() {
        return "LoggingSpanExporter{}";
    }
}
For some reason, it always misses the traceId and spanId, and the APM server refuses to process the request with the captured payload.
Question
Is there a way to capture the full OTLP payload when the trace is exported? I don't want to map the trace information into ECS and then use the intake/v2/events API (is this possible?).
Currently, we use OpenTelemetry to capture traces for our application, but we don't want to use a collector.
Can you elaborate on why using a collector is not suited?
The benefit of using a collector would be to manage data pipelines with all OTel (OpenTelemetry) contrib components.
e.g. there is a Kafka exporter that you could use to export traces to Kafka.
You could also configure your applications to send OTel traces directly to the APM Server by providing the right headers.
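For illustration, here is a minimal sketch of that direct setup with the OTLP/HTTP span exporter; the endpoint URL and secret token below are placeholders, not values from your environment:

import io.opentelemetry.exporter.otlp.http.trace.OtlpHttpSpanExporter;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;

public class DirectApmSetup {
    public static void main(String[] args) {
        // Point the exporter straight at APM Server's OTLP/HTTP traces endpoint.
        OtlpHttpSpanExporter exporter = OtlpHttpSpanExporter.builder()
                .setEndpoint("http://apm-server:8200/v1/traces")
                .addHeader("Authorization", "Bearer <apm-secret-token>")
                .build();

        // Register it with the SDK so spans are batched and sent directly to APM Server.
        SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
                .addSpanProcessor(BatchSpanProcessor.builder(exporter).build())
                .build();
        tracerProvider.close();
    }
}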
Instead, we want to publish the trace to Kafka and then use Logstash to push the trace to the APM server using the OpenTelemetry intake API (v1/traces).
If you're already using Logstash, that might be ok, but if you aren't it seems unnecessary to add it to the mix.
For some reason, it always misses the traceId and spanId, and the APM server refuses to process the request with the captured payload.
I am not part of the Java client SDK team; I'll loop them in for help.
Question
Is there a way to capture the full OTLP payload when the trace is exported? I don't want to map the trace information into ECS and then use the intake/v2/events API (is this possible?).
I genuinely haven't understood the question here.
I hope the Java team can.
Can you elaborate on why using a collector is not suited?
Currently, we are using the ELK stack, so we want to use Logstash instead of launching an OTel collector.
You could also configure your applications to send OTel traces directly to the APM Server by providing the right headers.
We tried this option, but under high load the APM server keeps spiking document ingestion into Elasticsearch, causing it to go down.
I genuinely haven't understood the question here.
We have two options for pushing traces to the APM Server.
The first option uses OTLP HTTP/Protobuf trace payload to v1/traces, which requires dumping trace information into the OTLP Protobuf payload, encoding it in UTF-8, and pushing it to Kafka. Then, Logstash pulls from Kafka and pushes it to the APM Server.
The other option is intake/v2/events, which uses a JSON payload following ECS (Elastic Common Schema). This requires mapping trace information into ECS, such as transaction and span data.
The question here is: Can we implement the first option, and is option 2 possible?
The first option uses OTLP HTTP/Protobuf trace payload to v1/traces, which requires dumping trace information into the OTLP Protobuf payload, encoding it in UTF-8, and pushing it to Kafka. Then, Logstash pulls from Kafka and pushes it to the APM Server.
I think that should be the route to take, though I'm not sure whether this will be possible via Logstash or if you'll need some custom app to pull the data from Kafka and forward it to the APM Server.
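If Logstash turns out not to handle the binary payload as-is, a rough sketch of such a custom forwarder could look like the following (the broker address, topic name, and APM Server URL are placeholders, not values from this thread):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OtlpKafkaForwarder {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");        // placeholder broker
        props.put("group.id", "otlp-forwarder");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        HttpClient http = HttpClient.newHttpClient();
        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("otlp-traces"));       // placeholder topic
            while (true) {
                ConsumerRecords<String, byte[]> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, byte[]> record : records) {
                    // Forward the raw OTLP protobuf bytes unchanged to the APM Server.
                    HttpRequest request = HttpRequest.newBuilder()
                            .uri(URI.create("http://apm-server:8200/v1/traces")) // placeholder URL
                            .header("Content-Type", "application/x-protobuf")
                            .POST(HttpRequest.BodyPublishers.ofByteArray(record.value()))
                            .build();
                    http.send(request, HttpResponse.BodyHandlers.discarding());
                }
            }
        }
    }
}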
The other option is intake/v2/events, which uses a JSON payload following ECS (Elastic Common Schema). This requires mapping trace information into ECS, such as transaction and span data.
I'd definitely not recommend that; the translation would be quite complex, with many pitfalls.
So for your original problem:
I'd recommend debugging and inspecting what data you are sending to Kafka. You can use opentelemetry-proto-java to decode and print your data after marshalling, after fetching it from Kafka, etc., to see where things go wrong.
Ideally, you compare the data against a logging-otlp exporter running in parallel.
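For instance, a tiny decode check like the one below, assuming the payload is the serialized ExportTraceServiceRequest that TraceRequestMarshaler produces:

import com.google.protobuf.InvalidProtocolBufferException;
import io.opentelemetry.proto.collector.trace.v1.ExportTraceServiceRequest;

final class OtlpPayloadDebug {

    // Decode and print an OTLP trace payload: run this on the bytes right after marshalling
    // and again on the bytes pulled from Kafka to see at which step trace_id/span_id get lost.
    static void dump(byte[] payload) throws InvalidProtocolBufferException {
        ExportTraceServiceRequest request = ExportTraceServiceRequest.parseFrom(payload);
        System.out.println(request);
    }
}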
One core problem I already see in your code is that you call ByteArrayOutputStream.toString(): the protobuf data is binary, not textual, so you can't just convert it to a string.
After marshaling, we obtain a byte array that represents the trace data in binary form.
Since we're using Kafka through Log4j2, the process involves encoding the binary data into a message. This message is then logged via Log4j2 and pushed to the appropriate Kafka topic. The use of toString() with the UTF-8 charset allows the binary data to be represented as a string for this purpose. Subsequently, Logstash retrieves the message, decodes it back into byte arrays, and then pushes the data to the APM Server using the v1/traces API. This is the expected flow.
It is by no means guaranteed that the protobuf binary data is UTF-8-decodable; you'll have to use a proper binary-to-string encoding like Base64.
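As a rough sketch of what that encode/decode step could look like (Base64 chosen purely as an example; otlpPayload stands for the raw bytes from the marshaler, e.g. ByteArrayOutputStream.toByteArray()):

import java.util.Base64;

final class PayloadCodec {

    // Producer side: turn the raw OTLP protobuf bytes into a string that can safely
    // travel through the Log4j2/Kafka message without being corrupted.
    static String encode(byte[] otlpPayload) {
        return Base64.getEncoder().encodeToString(otlpPayload);
    }

    // Consumer side (Logstash or a custom forwarder): recover the exact original bytes
    // before POSTing them to the APM Server's v1/traces endpoint.
    static byte[] decode(String kafkaMessage) {
        return Base64.getDecoder().decode(kafkaMessage);
    }
}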