Java Agent: Traces not getting stored in Elasticsearch

Kibana version: 8.5.1

Elasticsearch version: 8.5.1

APM Server version: 8.5.1

APM Agent language and version: Java 1.36.0

WildFly version (formerly known as JBoss): 20.0.0

Original install method (e.g. download page, yum, deb, from source, etc.) and version: from source

Fresh install or upgraded from other version? Fresh install

Is there anything special in your setup? For example, are you using the Logstash or Kafka outputs? Are you using a load balancer in front of the APM Servers? Have you changed index pattern, generated custom templates, changed agent configuration etc.

Agent configuration file:

transaction_max_spans=10000
environment=production
service_name=xyz
span_stack_trace_min_duration=0ms
stack_trace_limit=-1
recording=true
instrument=true
server_url=apmServerIP
span_frames_min_duration_ms=-1
application_packages=com.xyz
trace_methods=com.xyz.services.*
classes_excluded_from_instrumentation=org.json.*,com.blogspot.*,net.bytebuddy.*,com.fasterxml.jackson.*,io.jsonwebtoken.*,org.jsoup.*,org.springframework.*,org.jboss.*

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant):
We have a microservices architecture and use JBoss (WildFly) as the application server. The Java agent is attached via standalone.bat to monitor the application and capture traces.

When we perform transactions in the application UI, the agent captures only transactions of type scheduled, and only those are displayed in Kibana. No request-type transactions are captured; instead, the transaction 2#run is captured every time, as the document fields below show.

service.framework.name: TimerTask
service.language.name: Java
service.language.version: 11.0.22
transaction.name: 2#run
transaction.name.text: 2#run
transaction.sampled: true
transaction.span_count.dropped: 0
transaction.span_count.started: 0
transaction.type: scheduled

Sometimes restarting the application server helps and we get traces for HTTP requests, but most of the time we only see 2#run.

Which JBoss version are you using? Is it listed on the supported technologies page?

Also please provide APM agent debug logs so that we can further analyze your case.

Hello @Jonas_Kunz
I'm using WildFly 20.0.0.Final, and yes, it is listed on the supported technologies page.
Log file: APM Agent Debug Logs

The provided log shows that several web-request transactions are getting captured in addition to the 2#run timer transaction. You can search for endTransaction to find those, e.g.

2024-04-02 18:36:57,959 [default task-1] DEBUG co.elastic.apm.agent.impl.ElasticApmTracer - endTransaction 'LoginHelperServlet#doGet' 00-bf79b18ae4248dbaf037a9f641182e0a-f960c29be94b34c4-01 (778fd0)

So it seems like the data is being lost at the APM Server. Did you see anything in the logs there?
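
To get an overview of which transactions the agent actually ended, the endTransaction search mentioned above can be automated. The following is a minimal sketch, assuming the debug log uses the line format shown in the example (endTransaction followed by the transaction name in single quotes). The class name EndTransactionScan and the log path argument are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EndTransactionScan {
    // Matches the debug line format quoted above: ... - endTransaction '<name>' <trace context>
    private static final Pattern END_TX = Pattern.compile("endTransaction '([^']*)'");

    // Returns the transaction names of all endTransaction lines, in log order.
    static List<String> transactionNames(List<String> logLines) {
        List<String> names = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = END_TX.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the agent debug log (hypothetical, supply your own).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        transactionNames(lines).forEach(System.out::println);
    }
}
```

If names other than 2#run show up in the output, the agent is creating request transactions, and the next place to look is the path between the agent and the APM Server.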

BTW if you don't want to record your background task (2#run) as a transaction you can add timer-task to the disable_instrumentations config.

On the APM Server, I only see log entries for configuration requests (when the agent checks for dynamic configuration).

Some of the earlier APM agent logs show a read timeout:

[Server:main-server] 2022-12-28 16:10:03,888 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving METRICSET_JSON_WRITER event (sequence 28)
[Server:main-server] 2022-12-28 16:10:03,889 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.util.UrlConnectionUtils - Opening http://ELASTICKSERVER:8200/intake/v2/events without proxy
[Server:main-server] 2022-12-28 16:10:03,889 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Starting new request to http://ELASTICSERVER:8200/intake/v2/events
[Server:main-server] 2022-12-28 16:10:03,892 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Scheduling request timeout in 10s
[Server:main-server] 2022-12-28 16:10:03,892 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving METRICSET_JSON_WRITER event (sequence 29)
[Server:main-server] 2022-12-28 16:10:03,892 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving METRICSET_JSON_WRITER event (sequence 30)
[Server:main-server] 2022-12-28 16:10:03,892 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving METRICSET_JSON_WRITER event (sequence 31)
[Server:main-server] 2022-12-28 16:10:03,892 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving METRICSET_JSON_WRITER event (sequence 32)
[Server:main-server] 2022-12-28 16:10:03,893 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving METRICSET_JSON_WRITER event (sequence 33)
[Server:main-server] 2022-12-28 16:10:12,278 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Beginning scheduled configuration reload (interval is 30 sec)...
[Server:main-server] 2022-12-28 16:10:12,280 [elastic-apm-configuration-reloader] DEBUG co.elastic.apm.agent.impl.ElasticApmTracerBuilder - Finished scheduled configuration reload
[Server:main-server] 2022-12-28 16:10:13,896 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Receiving WAKEUP event (sequence 34)
[Server:main-server] 2022-12-28 16:10:13,896 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Request flush because the request timeout occurred
[Server:main-server] 2022-12-28 16:10:13,896 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Flushing 2257 uncompressed 786 compressed bytes
[Server:main-server] 2022-12-28 16:10:18,901 [elastic-apm-server-reporter] WARN  co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - null
[Server:main-server] 2022-12-28 16:10:18,901 [elastic-apm-server-reporter] INFO  co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Backing off for 16 seconds (+/-10%)
[Server:main-server] 2022-12-28 16:10:33,555 [elastic-apm-server-reporter] ERROR co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Error sending data to APM server: Read timed out, response code is -1
[Server:main-server] 2022-12-28 16:10:33,556 [elastic-apm-server-reporter] DEBUG co.elastic.apm.agent.report.IntakeV2ReportingEventHandler - Sending payload to APM server failed
[Server:main-server]        at co.elastic.apm.agent.report.AbstractIntakeApiHandler.endRequest(AbstractIntakeApiHandler.java:187) [elastic-apm-agent-1.35.0.jar:1.35.0]
[Server:main-server]        at co.elastic.apm.agent.report.AbstractIntakeApiHandler.endRequest(AbstractIntakeApiHandler.java:163) [elastic-apm-agent-1.35.0.jar:1.35.0]
[Server:main-server]        at co.elastic.apm.agent.report.IntakeV2ReportingEventHandler.onEvent(IntakeV2ReportingEventHandler.java:85) [elastic-apm-agent-1.35.0.jar:1.35.0]
[Server:main-server]        at co.elastic.apm.agent.report.IntakeV2ReportingEventHandler.onEvent(IntakeV2ReportingEventHandler.java:38) [elastic-apm-agent-1.35.0.jar:1.35.0]
[Server:main-server]        at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:168) [elastic-apm-agent-1.35.0.jar:1.35.0]
[Server:main-server]        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125) [elastic-apm-agent-1.35.0.jar:1.35.0]
[Server:main-server]        at co.elastic.apm.agent.util.ExecutorUtils$2.run(ExecutorUtils.java:99) [elastic-apm-agent-1.35.0.jar:1.35.0]
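
For context on the "Backing off for 16 seconds (+/-10%)" line: as far as I understand the APM agent specification, the delay after a failed request is min(errorCount, 6) squared, in seconds, with 10% jitter. A 16 second backoff would then mean the consecutive-error counter had already reached 4. Below is a minimal sketch of that calculation; the class and method names are hypothetical and this is not the agent's actual code.

```java
final class BackoffSketch {
    // Base delay in seconds for a given consecutive-error count: min(errorCount, 6)^2.
    static long baseDelaySeconds(long errorCount) {
        long capped = Math.min(Math.max(errorCount, 0), 6);
        return capped * capped;
    }

    // Lower and upper bound of the +/-10% jitter window around the base delay.
    static double[] jitterWindowSeconds(long errorCount) {
        long base = baseDelaySeconds(errorCount);
        return new double[] {base * 0.9, base * 1.1};
    }

    public static void main(String[] args) {
        // Print the delay schedule for the first few consecutive errors.
        for (long n = 0; n <= 7; n++) {
            double[] w = jitterWindowSeconds(n);
            System.out.printf("errors=%d base=%ds window=[%.1f, %.1f]%n",
                    n, baseDelaySeconds(n), w[0], w[1]);
        }
    }
}
```

In the log above, the next reporter entry appears roughly 14.7 seconds after the backoff message, which falls inside the 14.4 to 17.6 second window this sketch gives for a 16 second base delay.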

So I made these changes in the agent configuration file:

server_timeout=0s
api_request_time=120s
api_request_size=2mb

and in apm-server.yml:

  # Maximum permitted duration for reading an entire request.
  read_timeout: 125s

  # Maximum permitted duration for writing a response.
  write_timeout: 125s

  # Maximum duration before releasing resources when shutting down the server.
  #shutdown_timeout: 5s

  # Maximum permitted size in bytes of an event accepted by the server to be processed.
  max_event_size: 30720000

Are these configurations okay?

After these changes, I'm getting "request accepted" and bulk request logs on the APM Server, and traces are arriving in Elasticsearch, but only inconsistently.
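
One relationship worth checking in the settings above: as I understand the Java agent documentation, api_request_time has to be lower than the APM Server's read_timeout, otherwise the server may close a request the agent is still streaming. The values above (120s vs. 125s) satisfy this. A minimal sketch of that check follows; the class name TimeoutCheck and the small duration parser are hypothetical helpers and only cover the simple ms, s, and m suffixes.

```java
import java.time.Duration;

final class TimeoutCheck {
    // Parses simple values such as "500ms", "120s" or "2m".
    static Duration parse(String value) {
        String v = value.trim();
        if (v.endsWith("ms")) {
            return Duration.ofMillis(Long.parseLong(v.substring(0, v.length() - 2)));
        }
        if (v.endsWith("s")) {
            return Duration.ofSeconds(Long.parseLong(v.substring(0, v.length() - 1)));
        }
        if (v.endsWith("m")) {
            return Duration.ofMinutes(Long.parseLong(v.substring(0, v.length() - 1)));
        }
        throw new IllegalArgumentException("Unsupported duration: " + value);
    }

    // True if the agent ends its request before the server's read timeout expires.
    static boolean agentEndsRequestBeforeServerTimeout(String apiRequestTime, String serverReadTimeout) {
        return parse(apiRequestTime).compareTo(parse(serverReadTimeout)) < 0;
    }

    public static void main(String[] args) {
        // Values taken from the agent configuration and apm-server.yml above.
        System.out.println(agentEndsRequestBeforeServerTimeout("120s", "125s"));
    }
}
```

This only validates the timeout ordering. It does not explain the inconsistent traces by itself, so the APM Server logs around the missing requests are still the place to look.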