While testing the latest version of the Java agent, the following error appeared:
2021-12-13 12:59:25,268 elastic-apm-server-healthcheck ERROR No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging. See Log4j – Configuring Log4j 2 for instructions on how to configure Log4j 2
Now that the log4j security vulnerability has been removed, is additional configuration required to suppress this error? If not, should it be logged as a WARN instead?
@coynef is there a forecast for this fix?
If we use the current 1.28.1 with this bug, will it break anything or will it just log an error and continue?
@coynef a verification that the fix resolves the problem and everything is working as expected would be great before we merge and release it.
Thanks a lot!
It may have side effects if your application uses log4j as well, but most likely only agent logs are affected, if at all. If you didn't see any problems then you should be safe. If you verify the fix, we can release something that will even eliminate this small risk.
I asked whether everything works as expected because I changed the way the agent configures log4j, not because I expect it to cause issues in 1.28.1.
@Eyal_Koren I am concerned that we have released this to production and it "may have side effects". Are these releases not thoroughly tested on your end prior to being released?
Everything is thoroughly tested on our side, for every PR, daily snapshot and release.
You are welcome to review what we test even in the snapshot build I provided above. Lots of those describe multiple test applications running on lots of Servlet containers and scenarios.
However, we cannot test all possible permutations of instrumented libraries, configurations, setups etc.
So in this case, the agent logging system may be affected only if the traced application itself uses log4j. For example, the agent may pick up the application's log4j configuration, which would change how agent logs are written. In very specific configurations, for example if you configure your log4j to send logs to Kafka, there may be an effect on the application, but if you haven't seen anything so far, you should be OK.
Most importantly, we believe we already have a fix for that, and what would really help is your verification of it. It will not only eliminate the error in the log, but also these rare side effects.
In general, since the agent is observing classes when they are loaded to decide whether they should be instrumented, there is a startup delay that comes with it. We try to keep that minimal by caching discovered data and by eliminating unnecessary class checks wherever possible.
Can you measure the difference in the startup delay between this version and the former?
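One rough way to get comparable numbers is to print the JVM uptime at the point where the application considers itself ready, then run once with each agent version. The sketch below is only an illustration: the class name `StartupTimer` is hypothetical, and it assumes the call is placed wherever your application finishes starting. It relies only on the standard `java.lang.management` API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical helper, not part of the agent: reports how long the JVM has been up.
final class StartupTimer {

    // Milliseconds since the JVM started, as reported by the runtime MXBean.
    static long millisSinceJvmStart() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        return runtime.getUptime();
    }

    public static void main(String[] args) {
        // Assumption: in a real application this line would run once startup is complete.
        System.out.println("JVM uptime at ready point: " + millisSinceJvmStart() + " ms");
    }
}
```

Because an agent attached with `-javaagent` initializes before the application's `main` method runs, the uptime printed here should include the time spent in the agent. Averaging a few runs per version should give a fairer comparison than a single measurement.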
If you set log_level to debug, let the JVM fully start, make a request or two, terminate it and share the full log, we should be able to say where most of this time was spent. Maybe there is something we can optimize there.
In addition, if your JVM consumes a lot of its heap when starting, increasing the heap size should accelerate startup.
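To check whether heap pressure is a plausible factor, you could log heap usage right after startup. This is a minimal sketch using only `java.lang.Runtime`; the class name `HeapSnapshot` and the idea of calling it once startup completes are assumptions for illustration.

```java
// Hypothetical helper, not part of the agent: reports current heap usage of this JVM.
final class HeapSnapshot {

    // Bytes currently in use: heap committed by the JVM minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Upper bound the heap may grow to (controlled by -Xmx when that flag is set).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Used heap as a fraction of the maximum heap.
    static double usedFraction() {
        return (double) usedHeapBytes() / (double) maxHeapBytes();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("Heap used: " + (usedHeapBytes() / mb) + " MB of max "
                + (maxHeapBytes() / mb) + " MB");
    }
}
```

If the used fraction is already high right after startup, raising the maximum heap with the standard `-Xmx` JVM option is the usual way to test whether more headroom shortens startup.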
Hi @coynef , we just released the 1.28.2 agent version that should fix this log4j2 configuration issue.
Thus you should be able to replace the snapshot version with this release now.
For the slow startup time, it's still under investigation. Make sure to subscribe to the issue for progress on this side (this conversation will self-close after some time of inactivity).