Max spans has been reached

Kibana version: 7.17.0
Elasticsearch version: 7.17.0
APM Server version: 7.17.0
APM Agent language and version: Java 8, agent version 1.28.4

Description of the problem:
No matter what value (up to 2147483647) I put into the "transaction_max_spans" property, I get this WARN in the app output:

{"@timestamp":"2022-02-16T09:53:54.309Z", "log.level": "WARN", "message":"Max spans (2147483647) for transaction 'someclassnamehere#start' 00-e52f912e1a1185766deeae7ce9f8b76a-88bdcb02f7e163ed-01 (5a1c0881) has been reached. For this transaction and possibly others, further spans will be dropped. See config param 'transaction_max_spans'.", "ecs.version": "1.2.0","event.dataset":"someservicenamehere.apm","process.thread.name":"scheduling-1","log.logger":"co.elastic.apm.agent.impl.transaction.Span"}

Does this mean that my transaction produces an enormous number of spans (which I doubt) that can't be recorded? Is there a way to record a transaction (maybe partially) and show it in Kibana APM, or at least to debug the issue and find the root cause?

These are the two main causes for such issues:

  1. There is a transaction leak, meaning there are transactions that are referenced somewhere and never get cleared, so spans keep getting created for them. If those transactions are never ended, they won't even show up in Kibana. If you create and manage transactions manually (i.e. through the public API or the OpenTracing bridge), the leak is most probably there (see the sketch after this list). If not and there is indeed a leak, it could be an agent bug we need to look into.
  2. The transactions are ended and cleared properly, but there is indeed a very large number of spans created for each one. If you set the trace_methods config, look there and narrow it down. If you rely only on the agent's built-in instrumentations, we may want to look into why that is.
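If transactions are managed manually, a minimal sketch (assuming the co.elastic.apm.api public API; the class and method names are made up) of making sure every started transaction is ended, so it cannot leak:

```java
import co.elastic.apm.api.ElasticApm;
import co.elastic.apm.api.Scope;
import co.elastic.apm.api.Transaction;

public class ManualTransactionExample {

    public void handleMessage(String payload) {
        // Start the transaction through the public API
        Transaction transaction = ElasticApm.startTransaction();
        try (Scope scope = transaction.activate()) {
            transaction.setName("ManualTransactionExample#handleMessage");
            transaction.setType(Transaction.TYPE_REQUEST);
            process(payload);
        } catch (RuntimeException e) {
            transaction.captureException(e);
            throw e;
        } finally {
            // Always end the transaction; a transaction that is never ended
            // keeps collecting spans until transaction_max_spans is reached
            transaction.end();
        }
    }

    private void process(String payload) {
        // business logic
    }
}
```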

The best way to get to the root cause is through a debug log (see logging configurations).
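For example, an illustrative snippet for enabling it (assuming configuration through an elasticapm.properties file; the log file path is a placeholder):

```
# elasticapm.properties
log_level=debug
# optionally write the agent log to its own file
log_file=/tmp/elastic-apm-agent.log
```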

If this is caused only by the agent's built-in instrumentations, we would like to hear all about it, including which exact instrumentation is involved, which technology/library and version, and of course your debug log covering the period from startup until you get one or two of these log events.

I hope this helps

Hello Eyal,
Thank you for the fast feedback.
We don't use the public API or the OpenTracing bridge.
Spring Boot 2.3.0

Problematic transaction: DeliveryNoteGoodsDEConsumer#start
First appearance of "...Max spans (500) for transaction..." is about 45 minutes after service startup.

debug log output:
https://raw.githubusercontent.com/scroodj/apm/main/transcation_spans_debug.txt

OK, then it is a real issue of too many spans being created for a never-ending transaction that traces a scheduled method.

What is your DeliveryNoteGoodsDEConsumer#start annotated with? Is it org.springframework.scheduling.annotation.Schedules or something else?

Normally, our instrumentation of such methods assumes that they start and end, letting the framework that runs them reschedule them every time they finish. Is it possible that this method runs an endless loop?
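For illustration, a minimal sketch of the pattern that instrumentation assumes (the class and method names here are made up): each invocation does one unit of work and returns, so each produces a transaction that starts and ends.

```java
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class PollingConsumer {

    // The agent starts a transaction when Spring invokes this method and
    // ends it when the method returns; Spring then re-invokes it after
    // the configured delay.
    @Scheduled(fixedDelay = 5000)
    public void poll() {
        // do one unit of work, then return
    }
}
```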

Is it org.springframework.scheduling.annotation.Schedules or something else?

  • Yes, it's org.springframework.scheduling.annotation.Scheduled

Is it possible that this method runs an endless loop?

  • Yes, it runs forever.

In that case, the agent behaves as expected. I'd say running an endless loop within a Scheduled task defeats the purpose of scheduling, which is to let something else measure the timings and do the sleeps between method invocations.
If your Scheduled method just ran its task and returned, you would get meaningful transactions. With your current implementation, these transactions are meaningless: they will never end and never be reported.
You can disable this instrumentation by setting disable_instrumentations=scheduled and use the public API to monitor some other method that encapsulates the recurring task and does return after every execution, as sketched below.
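A possible sketch of that approach, assuming disable_instrumentations=scheduled is set and using the @CaptureTransaction annotation from the public API (the inner method name is made up):

```java
import co.elastic.apm.api.CaptureTransaction;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class DeliveryNoteGoodsDEConsumer {

    // With disable_instrumentations=scheduled, this endless loop is no
    // longer traced as a single never-ending transaction.
    @Scheduled(fixedDelay = 1000)
    public void start() {
        while (true) {
            handleNextMessage();
        }
    }

    // Each iteration becomes its own transaction that starts and ends,
    // so it gets reported and its span count stays bounded.
    @CaptureTransaction("DeliveryNoteGoodsDEConsumer#handleNextMessage")
    public void handleNextMessage() {
        // process one message, then return
    }
}
```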

I hope this helps.

Thank you for shedding light on this.
