Kibana version : 7.9.2
Elasticsearch version : 7.9.2
APM Server version : 7.9.2
APM Agent language and version : JAVA apm-agent-attach:1.17.0, apm-agent-api:1.17.0
Java version : openjdk:11
We have Spring-boot application with the following APM config:
service_name=app-name
server_urls=http://apm-server-ops.app.com
transaction_sample_rate=1
application_packages=com.app.package
ignore_urls:/actuator*, /swagger-ui*
enable_log_correlation=true
transaction_max_spans=1000
span_min_duration=500ms
I have struggled with this warning (originally reported here )
2020-09-17 14:15:30,649 [DefaultMessageListenerContainer-2] WARN co.elastic.apm.agent.impl.transaction.Span -
Max spans (500) for transaction 'JMS RECEIVE from queue Consumer.SmsServiceImpl.VirtualTopic.****CreatedEvent' 00-6efa3ce85265c5f6d3f8d53feed3d11f-acdeb259e07a8286-00 (5084b178) has been reached.
For this transaction and possibly others, further spans will be dropped. See config param 'transaction_max_spans'.
based on recommendation i've increased transaction_max_spans
& span_min_duration
params. However I still see these warnings
Generally, I can say that such huge spans (more than 1000 with current config) do not fit in our use case:
The application is not very loaded and it processes 1-2 HTTP requests per 30sec for an entity creation (persists in the DB). Once the entity is created the application emits an event to ActiveMQ which is picked up by five-six ActiveMQ listeners which live in the same application. Listeners do some stupid job like: send sms, email, perform tracking. Every listeners does max 2-3 sql request + calls some http service.
After some investigation I see that quite some ActiveMQ transactions are grouped under one incorrect transaction id (trace id is also the same):
here is filter of logs by transaction.id from our kibana over 4 days:
as you can see, it finds over 600 (!) log entries from totally different unrelated ActiveMQ events, which are published upon creation of totally different entities, so should not share neither trace.id nor transaction.id.
From my understanding if I create entity A and persist it into DB and emit corresponding EntityCreatedEvent which is picked up by several ActiveMQ listeners, then only this flow should be grouped by one trace.id and should not be mixed up with flow caused by creation of entity B (they should be grouped under different trace.id). Then every event listener (Listener-1, Listener-2, etc) creates a separate transaction (unique transaction.id), so I have:
transaction.id: Http-Session-1 trace.id: event-A
transaction.id: Listener-1-event-A-1 trace.id: event-A
transaction.id: Listener-2-event-A-2 trace.id: event-A
transaction.id: Listener-3-event-A-3 trace.id: event-A
transaction.id: Http-Session-2 trace.id: event-B
transaction.id: Listener-1-event-B-1 trace.id: event-B
transaction.id: Listener-2-event-B-2 trace.id: event-B
transaction.id: Listener-3-event-B-3 trace.id: event-B
but in my Kibana (as in the screenshot) i see basically that
transaction.id: Listener-1-event-A-1 trace.id: event-A
transaction.id: Listener-1-event-B-1 trace.id: event-B
are being tracked by the same trace.id and transaction.id
And this logically causes more than 1000 spans joining one (incorrect) transaction id.
Is it a bug on apm agent or my assumptions are not correct?