JVM Crash with APM

Hello there,

For some reason, both attach methods (manual and automatic) crash the JVM of my JBoss application. This application runs on Java 7. I have tried running both agents with Java 8 and with the same Java version the JBoss application uses.

Kibana version : 7.9.0

Elasticsearch version : 7.9.2

APM Server version : 7.9.2

APM Agent language and version : Java Oracle 8, 1.18.0

Browser version : Google Chrome 80.0.3987.149

Original install method (e.g. download page, yum, deb, from source, etc.) and version : The agents were installed from the following links:

Fresh install or upgraded from other version? Fresh install

Is there anything special in your setup? The communication between the agent and the APM server is plain within an internal network. The problem is the agent won't attach to the Java VM, so it has nothing to do with a communication issue with APM server.

Description of the problem including expected versus actual behavior. Please include screenshots (if relevant) :

/apps/bin/comerzzia/jboss-as-7.1.1/bin/standalone.sh: line 178: 14707 Aborted                "/apps/bin/comerzzia/jdk1.7.0_80/bin/java" -D"[Standalone]" -server -XX:+UseCompressedOops -XX:+TieredCompilation -Duser.timezone=Europe/Madrid -Xms8192m -Xmx8192m -XX:MaxPermSize=2048m -Djava.net.preferIPv4Stack=true -Dorg.jboss.resolver.warning=true -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djboss.modules.system.pkgs=org.jboss.byteman -Djava.awt.headless=true -Djboss.server.default.config=standalone.xml -Dfile.encoding=UTF-8 -Dorg.apache.catalina.connector.URI_ENCODING=UTF-8 -Dclient.encoding.override=UTF-8 -XX:+TieredCompilation -XX:ReservedCodeCacheSize=256m -XX:+UseCodeCacheFlushing "-Dorg.jboss.boot.log.file=/apps/bin/comerzzia/jboss-as-7.1.1/standalone/log/boot.log" "-Dlogging.configuration=file:/apps/bin/comerzzia/jboss-as-7.1.1/standalone/configuration/logging.properties" -jar "/apps/bin/comerzzia/jboss-as-7.1.1/jboss-modules.jar" -mp "/apps/bin/comerzzia/jboss-as-7.1.1/modules" -jaxpmodule javax.xml.jaxp-provider org.jboss.as.standalone -Djboss.home.dir="/apps/bin/comerzzia/jboss-as-7.1.1"

You will find the JVM crash log here: https://pastebin.pl/view/b7c86f3f

According to the Supported technologies page (https://www.elastic.co/guide/en/apm/agent/java/current/supported-technologies-details.html#supported-java-versions), both Java 7 and Java 8 are supported.

Is this happening because of a bug? Is there an additional requirement I am overlooking?

Thanks in advance,

This seems to be a JVM bug. Have you tried updating your JVM to a more recent update of Java 7?

Does the crash happen right after startup or only after a while?

Hello Felix Barnsteiner,

Thank you for your quick response!

This is the current version of Java that the application and the agent use:

[root@company jboss-as-7.1.1]# /apps/bin/comerzzia/jdk1.7.0_80/bin/java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

The crash happens after a while. The JBoss application runs for a few minutes, and then the process is suddenly aborted and the JVM crashes.

Since version 1.18.0, we're using a new architecture in the agent that relies on the invokedynamic instruction. Early versions of Java 7 (prior to update 60) and Java 8 (prior to update 40) are known to have bugs in these areas. That's why we don't support them.
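To make the update thresholds above concrete, here is a minimal bash sketch that checks a legacy-style Java version string (such as 1.7.0_80) against them. The function name java_update_ok is hypothetical and not part of the agent; the thresholds (Java 7 update 60, Java 8 update 40) are the ones stated above, and Java 9+ version strings are deliberately not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Elastic APM agent.
# Returns 0 if the version is at or above the update level mentioned above,
# 1 if it is below, and 2 if the version string is not recognized.
java_update_ok() {
  local version="$1" major update
  # Legacy version strings look like 1.<major>.<minor>_<update>, e.g. 1.7.0_80.
  # Anything after the update number (such as -b15) is ignored.
  if [[ "$version" =~ ^1\.([0-9]+)\.[0-9]+(_([0-9]+))? ]]; then
    major="${BASH_REMATCH[1]}"
    update="${BASH_REMATCH[3]:-0}"
  else
    return 2
  fi
  case "$major" in
    7) (( 10#$update >= 60 )) ;;  # 10# forces base 10 so "08" is not read as octal
    8) (( 10#$update >= 40 )) ;;
    *) return 2 ;;
  esac
}

# Example (not executed here): read the version from the JVM in use.
# java -version prints to stderr, hence the 2>&1.
#   v="$(/apps/bin/comerzzia/jdk1.7.0_80/bin/java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
#   java_update_ok "$v" && echo "update level OK" || echo "too old or unrecognized"
```

Passing this check only means the JVM is above the known-bad update levels; as this thread shows, it does not guarantee the absence of other JVM bugs.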

So far, we haven't heard about issues with Java 7 update 80. But quite likely, your issue is something that's fixed in more recent update versions.

If you cannot update your Java, try removing these JVM flags which may interfere with invokedynamic: -XX:+UseCompressedOops -XX:+TieredCompilation

Hello Felix,

I removed both options from the Java launch command. The server ran for approximately 1h 40min and then the JVM crashed.

What would you recommend, then: waiting for a fix? Do you think my only option, for now, is upgrading to the latest Java 7?

Thanks for your help,

As this seems to be a JVM bug, there's not much we can do in the agent. So yes, so far, it seems like updating to the latest Java 7 is the only fix.

Okay, then. Thank you so much for letting me know.

Just wondering, could you send us the agent configuration (without any sensitive data in it)?
Have you enabled the sampling profiler by any chance? https://www.elastic.co/guide/en/apm/agent/java/current/config-profiling.html#config-profiling

Hello Sylvain,

The command I am using to launch the attached agent is the following:

sudo -E -u jboss /apps/bin/comerzzia/jdk1.7.0_80/bin/java -jar $(pwd)/apm-agent-attach-1.18.0-standalone.jar --pid $PIDFILE --config service_name=comerzzia-pre --config server_urls=http://192.168.250.114:8200 --config application_packages=bin.comerzzia,org.apache.jsp.backoffice,org.jboss.as --config log_file=/apps/bin/comerzzia/jboss-as-7.1.1/standalone/log/apm.log --config log_level=DEBUG

As you can see the options are:

· service_name
· server_urls
· application_packages
· log_file
· log_level

I have not deliberately enabled the sampling profiler option.

Hope this sheds some light on the issue.

Thanks, that lets us confirm it's not related to any experimental feature. Better to check twice :-).

Sure Sylvain, thank you!

Could you try to run with version 1.17.0?

Hello,

I downloaded the 1.17.0 version of the agent from the following link: https://repo1.maven.org/maven2/co/elastic/apm/apm-agent-attach/1.17.0/apm-agent-attach-1.17.0-standalone.jar and apparently, the Java process has not died yet, which is great:

[root@company jboss-as-7.1.1]# ps -C java -o pid,etime,cmd
  PID     ELAPSED CMD
28028    09:20:56 /apps/bin/comerzzia/jdk1.7.0_80/bin/java -D[Standalone] -server -Duser.timezone=Europe/Madrid -Xms8192m -Xmx8192m -XX:MaxPermSize=2048m -Djava.net.preferIPv4Stack=true -Dorg.jboss.resolver.warning=true -Dsun.rmi.dgc.cli
[root@company jboss-as-7.1.1]# 

The only problem is that the Transactions view in Kibana's APM page is empty.


These errors were not appearing before. Do I have to downgrade the APM server version? Or does it have to do with the apm-server configuration?

The apm-server logs are the following:

oct 02 22:27:43 SRVHIDS1 apm-server[23462]: 2020-10-02T22:27:43.882+0200        INFO        [publisher_pipeline_output]        pipeline/output.go:151        Connection to backoff(elasticsearch(http://192.168.250.114:9200)) established
oct 02 22:28:21 SRVHIDS1 apm-server[23462]: 2020-10-02T22:28:21.805+0200        INFO        [request]        middleware/log_middleware.go:97        request accepted        {"request_id": "a07b2586-4a2b-40ca-b311-c24fb983ba3d", "method": "POST", "URL": "/intake/v2/events", "content_length": -1, "remote_address": "192.168.250.29", "user-agent": "elasticapm-java/1.17.0", "response_code": 202}
oct 02 22:28:51 SRVHIDS1 apm-server[23462]: 2020-10-02T22:28:51.810+0200        INFO        [request]        middleware/log_middleware.go:97        request accepted        {"request_id": "fe94be13-03d0-4ecb-baaf-f8d412425c9f", "method": "POST", "URL": "/intake/v2/events", "content_length": -1, "remote_address": "192.168.250.29", "user-agent": "elasticapm-java/1.17.0", "response_code": 202}
oct 02 22:29:21 SRVHIDS1 apm-server[23462]: 2020-10-02T22:29:21.804+0200        INFO        [request]        middleware/log_middleware.go:97        request accepted        {"request_id": "a129b39c-dfc4-4da9-ae2b-d38a4e8440a4", "method": "POST", "URL": "/intake/v2/events", "content_length": -1, "remote_address": "192.168.250.29", "user-agent": "elasticapm-java/1.17.0", "response_code": 202}
oct 02 22:29:51 SRVHIDS1 apm-server[23462]: 2020-10-02T22:29:51.814+0200        INFO        [request]        middleware/log_middleware.go:97        request accepted        {"request_id": "ae94d755-0d1a-44b3-b4dc-a16cd8fed063", "method": "POST", "URL": "/intake/v2/events", "content_length": -1, "remote_address": "192.168.250.29", "user-agent": "elasticapm-java/1.17.0", "response_code": 202}
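The log lines above each show an intake request from the agent answered with HTTP 202 (Accepted), which suggests the agent-to-server leg is working. As a rough sketch, the helper below tallies intake requests per user agent and response code from apm-server log lines read on stdin. The function name count_intake_responses is hypothetical, and it assumes the field order shown in the lines above ("user-agent" followed by "response_code") and user-agent values without spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize apm-server intake requests from log lines on stdin.
# Prints one line per combination: "<count> <user-agent> <response_code>".
count_intake_responses() {
  # Keep only requests to the intake endpoint seen in the logs above.
  grep '"URL": "/intake/v2/events"' \
    | sed -E 's/.*"user-agent": "([^"]*)".*"response_code": ([0-9]+).*/\1 \2/' \
    | sort | uniq -c \
    | awk '{print $1, $2, $3}'   # normalize the padding that uniq -c adds
}

# Example (not executed here), assuming the logs are available through journald:
#   journalctl -u apm-server --no-pager | count_intake_responses
```

A tally consisting only of 202 responses would point away from the agent and toward the server output, Elasticsearch, or Kibana as the place to look for the missing transactions.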

My apm-server configuration is the following:

[root@SRVHIDS1 ~]# cat /etc/apm-server/apm-server.yml | grep -v '#' | sed -r '/^\s*$/d'
apm-server:
  host: "192.168.250.114:8200"
  setup.kibana.host: "192.168.250.114:5601"
  setup.dashboards.enabled: true
  logging.level: info
  logging.to_files: false
  ssl:
    enabled: false
  kibana:
    enabled: true
    host: "192.168.250.114:5601"
output.elasticsearch:
  hosts: ["192.168.250.114:9200"]

Thank you for all your support,

Any thoughts on this?

I have also tried apm-server versions 7.6, 7.7 and 7.8, and the Transactions page is still empty. Any thoughts on this?

This seems to be a UI or server issue. I recommend creating a new thread and tag it with ui and server to get help from the experts.

Okay! Thank you very much Felix!

@Jose_Angel_Morena_Si we are trying to figure out what caused this JVM crash, so that we know what to look for (and so that you can upgrade your agent at some point :slight_smile: ).

Removing -XX:+UseCompressedOops -XX:+TieredCompilation did not help before, however, I now see that you have -XX:+TieredCompilation twice in the command line. If you did not remove both occurrences last time you tried, would you try again with agent 1.18.1 and removing -XX:+UseCompressedOops and both -XX:+TieredCompilation occurrences?

If you already set this up for test and it doesn't solve the problem, I would also try without -XX:+UseCodeCacheFlushing, even though it doesn't seem directly related to a crash.
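Since a flag that appears twice is easy to miss in a long command line, here is a small bash sketch for spotting repeated options and for removing every occurrence of the suspect flags from an options string. The function names list_duplicate_flags and strip_flags are hypothetical, and the sketch assumes options are separated by whitespace and contain no embedded spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a JVM options string.

# Print each whitespace-separated option that occurs more than once.
list_duplicate_flags() {
  tr -s ' ' '\n' <<< "$1" | grep -v '^$' | sort | uniq -d
}

# strip_flags OPTS FLAG...: print OPTS with every occurrence of each FLAG removed.
strip_flags() {
  local opts="$1" tok flag keep
  local -a toks=() out=()
  shift
  read -ra toks <<< "$opts"          # split on whitespace without glob expansion
  for tok in "${toks[@]}"; do
    keep=1
    for flag in "$@"; do
      if [[ "$tok" == "$flag" ]]; then keep=0; fi
    done
    if (( keep )); then out+=("$tok"); fi
  done
  printf '%s\n' "${out[*]}"
}

# Example (not executed here), using an excerpt of the options from this thread:
#   opts="-server -XX:+UseCompressedOops -XX:+TieredCompilation -Xms8192m -XX:+TieredCompilation -XX:+UseCodeCacheFlushing"
#   list_duplicate_flags "$opts"
#   strip_flags "$opts" -XX:+UseCompressedOops -XX:+TieredCompilation
```

Where the options are actually defined depends on the installation (for example a startup script or its configuration file), so the cleaned string still has to be applied there by hand.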

Hello Eyal,

Thanks for your suggestions, but I am sorry to say that the JVM still crashes :pensive:.

Is there any information I can provide that would help you solve this issue?

@Jose_Angel_Morena_Si Thanks for testing and reporting back!
I think the best input you can provide is whether this crash still happens with the latest Java 7 build.
Such crashes usually indicate a JVM bug that is triggered by adding the agent to the existing setup. Starting with 1.18.0, our agent depends heavily on the invokedynamic bytecode instruction, which is known to be related to bugs in earlier Java 7 JVMs.