APM agent for Java is causing system crashes

(Antuan) #1

The APM Java agent works, but at some point Tomcat crashes.

A fatal error has been detected by the Java Runtime Environment:

Reason for the crash:
SIGSEGV (0xb) at pc=0x00000005f0673940, pid=8545, tid=140480805185280

Active Thread (when app crashed):
apm-reporter - nativeThreadId:8574 - state:_thread_in_Java - threadType:JavaThread
Stacktrace:0x00000005f0673940

Red Hat Enterprise Linux Server release 6.6 (Santiago)
JAVA_HOME=/usr/java/jdk1.7.0_60
Apache Tomcat/7.0.27
-javaagent:/opt/elastic-apm-agent-1.3.0.jar

Let me know if there's anything else I can provide that would help.

(Felix Barnsteiner) #2

Hi,

I'm sorry to hear the Java agent causes trouble. Can you attach the hs_err_pid<pid>.log file (for example via gist.github.com)?

How often did this happen? Did it happen immediately or after the application ran for a while?

You are using Java 1.7.0_60. There are more recent updates in the 1.7 branch. I think the latest one is 1.7.0_80. It's worth updating to that version.
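
To confirm which update the Tomcat JVM is actually running before and after an upgrade, a minimal sketch like the following can help. The class name `JvmUpdateCheck` and its `updateNumber` helper are hypothetical and not part of the APM agent; it only reads standard system properties and assumes the legacy `1.7.0_NN` version format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Elastic APM agent.
final class JvmUpdateCheck {

    // Matches legacy version strings such as "1.7.0_60" or "1.7.0_71-b14".
    private static final Pattern LEGACY = Pattern.compile("^1\\.(\\d+)\\.\\d+_(\\d+)");

    // Returns the update number (60 for "1.7.0_60"),
    // or -1 if the string does not use the legacy format.
    static int updateNumber(String javaVersion) {
        Matcher m = LEGACY.matcher(javaVersion);
        return m.find() ? Integer.parseInt(m.group(2)) : -1;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version    = " + version);
        System.out.println("java.vm.name    = " + System.getProperty("java.vm.name"));
        System.out.println("java.vm.version = " + System.getProperty("java.vm.version"));
        System.out.println("update number   = " + updateNumber(version));
    }
}
```

Run it with the same `JAVA_HOME` that Tomcat uses, otherwise it reports a different JVM than the one that crashed.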

Best,
Felix

(Antuan) #3

Hi,

We ran the previous version (1.2.0) without failures.
The problem only appeared with the latest version (1.3.0).

The failure has occurred several times in a week.
The system had been running for more than 12 hours when it happened.

I have uploaded the log file to the following location:
hs_err_pid23331.log

Regards,
Antuan

(Felix Barnsteiner) #4

Hmm, that looks strange. That might be a JVM bug. Is it possible for you to use the latest 1.7 update release?

(Antuan) #5

I am sorry, it is not possible right now.

#6

Same problem here using elastic-apm-agent-1.3.0.jar; I am removing the agent for now.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00000007406c16f8, pid=19318, tid=140107825260288
#
# JRE version: Java(TM) SE Runtime Environment (7.0_71-b14) (build 1.7.0_71-b14)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.71-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  0x00000007406c16f8
#
....
....
....
  0x00007f6d703e0000 JavaThread "apm-metrics-reporter" daemon [_thread_blocked, id=19337, stack(0x00007f6d65079000,0x00007f6d6517a000)]
=>0x00007f6d70367000 JavaThread "apm-reporter" daemon [_thread_in_Java, id=19336, stack(0x00007f6d6517a000,0x00007f6d6527b000)]
  0x00007f6d7033e800 JavaThread "apm-request-timeout-timer" daemon [_thread_blocked, id=19335, stack(0x00007f6d65584000,0x00007f6d65685000)]
....
....
....
Internal exceptions (10 events):
Event: 49914.327 Thread 0x00007f6ce7667000 Threw 0x00000007db227018 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49924.330 Thread 0x00007f6ce7667000 Threw 0x00000007db60f990 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49934.333 Thread 0x00007f6ce7667000 Threw 0x00000007db614d98 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49944.336 Thread 0x00007f6ce7667000 Threw 0x00000007db61a1a0 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49954.340 Thread 0x00007f6ce7667000 Threw 0x00000007db7fca20 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49964.343 Thread 0x00007f6ce7667000 Threw 0x00000007db801e28 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49974.346 Thread 0x00007f6ce7667000 Threw 0x00000007db807230 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49984.349 Thread 0x00007f6ce7667000 Threw 0x00000007db89e9b8 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 49994.352 Thread 0x00007f6ce7667000 Threw 0x00000007db8a3dc0 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
Event: 50004.355 Thread 0x00007f6ce7667000 Threw 0x00000007db8a91c8 at /HUDSON/workspace/7u-2-build-linux-amd64/jdk7u71/1605/hotspot/src/share/vm/prims/jni.cpp:743
...
...
...
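
The two most useful facts in an excerpt like the one above are the thread marked with `=>` (the thread that was running when the JVM crashed) and the timestamp of the last logged event. A rough sketch for pulling both out of an hs_err file follows. The class `HsErrSummary` is hypothetical, the regular expressions are written against the lines shown in this post only, and it assumes that `Event:` timestamps are seconds since JVM start.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not an official tool.
final class HsErrSummary {

    // The crashing thread is the entry prefixed with "=>" in the thread list.
    private static final Pattern CURRENT_THREAD =
            Pattern.compile("^=>\\S+ JavaThread \"([^\"]+)\"");

    // Event lines look like "Event: 50004.355 Thread 0x... Threw ...".
    private static final Pattern EVENT = Pattern.compile("^Event: (\\d+\\.\\d+) ");

    // Returns the name of the thread marked "=>", or null if none is found.
    static String currentThreadName(List<String> lines) {
        for (String line : lines) {
            Matcher m = CURRENT_THREAD.matcher(line);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    // Returns the timestamp of the last "Event:" line, or -1 if there is none.
    static double lastEventSeconds(List<String> lines) {
        double last = -1;
        for (String line : lines) {
            Matcher m = EVENT.matcher(line);
            if (m.find()) {
                last = Double.parseDouble(m.group(1));
            }
        }
        return last;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to an hs_err_pid<pid>.log file, or run without
        // arguments to use two sample lines taken from this post.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8)
                : Arrays.asList(
                        "=>0x00007f6d70367000 JavaThread \"apm-reporter\" daemon [_thread_in_Java, id=19336]",
                        "Event: 50004.355 Thread 0x00007f6ce7667000 Threw 0x00000007db8a91c8 at jni.cpp:743");
        double seconds = lastEventSeconds(lines);
        System.out.println("crashing thread: " + currentThreadName(lines));
        System.out.println(String.format("last event at %.1f s (%.1f h)", seconds, seconds / 3600.0));
    }
}
```

Under that assumption, the last event at 50004.355 s puts the crash at roughly 13.9 hours of uptime, which is in line with the "more than 12 hours" reported earlier in this thread.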

It looks like this agent may not work well with the Oracle Java 7 JVM, at least at version 1.7.0_71-b14. The latest publicly available Oracle Java 7 version appears to be 7u80, which is several years old; later updates are only available to Oracle customers. I see the apm-agent is tested against the latest OpenJDK 7 distribution, namely 7-jre-slim. If the integration test only verifies that basic reporting works, it may not even catch this issue, since it's been reported that the agent runs for a while and then crashes after about a couple of days.

I think the best bet is to switch to a newer version of Java in order to work with the latest JVM agent. It's unreasonable to expect the JVM agent maintainers to check backward compatibility for such an old version of Java, even though some folks are still using it :wink:

APM Agent maintainers, is there some documentation on how the apm-reporter agent is tested before it is released?

I see these TomcatIT tests and I'm wondering if these are the only tests or if there are other ones too.

Thanks.

(Eyal Koren) #7

Hi and thanks for reporting.
Please see https://github.com/elastic/apm-agent-java/issues/458 - we are still trying to figure out what the root cause of this problem is.

(Eyal Koren) #8

@antuan @20q 1.4.0 is released, containing a fix for this issue

(Antuan) #9

That is great.
