Hi,
I am trying to use apm-java-attach on AWS Lambda with OpenJDK 8, but I am getting "Attaching the Elastic APM Java agent is not possible on this JVM".
It looks like APM attach does not work on the AWS Lambda OpenJDK 8 runtime?
Welcome to the forum and thanks for your question!
Probably this is because the Java runtime used to run your Lambda is a JRE, rather than a full JDK. In any case, it seems it does not include a tools.jar (at least not a valid AttachProvider).
Is this something you can control?
Hi , thanks for replying.
I've sorted out the issue already.
Yup, so by default, the OpenJDK runtime provided by AWS Lambda doesn't include tools.jar on the classpath. I had to provide it myself and then set the property "co.elastic.apm.attach.bytebuddy.agent.toolsjar" to the path of tools.jar.
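For anyone following the same route, here is a minimal sketch of the property step described above. The property name is the one quoted in this post; the helper class `ToolsJarSetup`, its method name, and whatever path you pass in are hypothetical. The attach call itself is not shown, since it depends on the attach library version in use; it would have to run after the property is set.

```java
import java.io.File;

// Hypothetical helper: points the attacher at a tools.jar bundled with the function.
final class ToolsJarSetup {
    // Property name as quoted in the post above.
    static final String TOOLS_JAR_PROPERTY =
            "co.elastic.apm.attach.bytebuddy.agent.toolsjar";

    // Sets the property only if the file exists; returns whether it was set.
    static boolean configureToolsJar(String toolsJarPath) {
        File toolsJar = new File(toolsJarPath);
        if (!toolsJar.isFile()) {
            return false;
        }
        System.setProperty(TOOLS_JAR_PROPERTY, toolsJar.getAbsolutePath());
        return true;
    }
}
```

Checking that the file exists first makes a packaging mistake show up as a clear `false` at startup rather than as the generic "not possible on this JVM" message.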
It all works fine now; however, I am having a small issue with the reporting.
I need to flush all the buffered messages in the agent to the APM server before the Lambda handler method returns.
Are you aware of any way in the agent code that allows me to manually flush all the transactions/spans to the APM server?
Yeah, that's another issue we need to see how we tackle with regards to AWS Lambda and the like...
Due to the costs of compression and transmission, we send batched events to the APM server in the background. In addition, we attach common metadata once per batch, which is another optimization. So the optimal approach would be to find a way to utilize the existing communication mechanisms, but that depends on the extent of control you have over the JVM.
There is no way to flush events manually, but one thing you can try is to set the api_request_size configuration option to a very low value and wait some time after ending the traced transaction before letting the Lambda handler return. This configuration sets the maximum time we allow to accumulate events before sending them. However, note that this may incur considerable overhead, mostly on the JVM but also on the APM server.
In general, we made lots of efforts to make sure we don't do any heavy blocking actions on the request-handling thread, and this defies that principle.
Just some questions out of curiosity:
attach API as well within the Lambda or as an initiation thing?
Let me answer your questions first:-
As I mentioned earlier, I am currently using the "report_sync" config and it seems to be doing the job reporting the value, however it will for sure impact the response time for the APIs (from the client perspective).
I will give api_request_size a go and see how well it can do. Let me know if you have a better idea.
Right- this is blocking the response.
I put the right link, but with the wrong config option. I meant api_request_time!
But note that this will help you only if you can release the response back to the client and then wait within the handler. Otherwise this is essentially blocking as well.
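A rough sketch of the "low api_request_time plus wait" idea, for illustration only. It assumes the option is passed as a JVM system property with the `elastic.apm.` prefix before the agent is attached; the `1s` value, the class name `GracePeriodHandler`, and the grace period are made up for the example.

```java
// Hypothetical handler wrapper: does the work, then waits so the agent's
// background reporter has a chance to send the current batch.
final class GracePeriodHandler {
    // Ask the agent to close each batch after about one second (assumed value).
    // This has to run before the agent is attached to have any effect.
    static void configureShortRequestTime() {
        System.setProperty("elastic.apm.api_request_time", "1s");
    }

    static String handle(String input, long graceMillis) {
        String result = input.toUpperCase(); // stand-in for the real, traced work
        try {
            // Wait after the traced work has ended and before returning.
            Thread.sleep(graceMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return result;
    }
}
```

The grace period would need to be somewhat longer than the configured request time, and in this form the sleep delays the response by the same amount, which is exactly the blocking concern raised above.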
But are you using it each time the Lambda is invoked, or as some kind of JVM initiation?
The way AWS Lambda works, the code is initialized once and then stays in memory. So attach() will only happen once per JVM instance.
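The once-per-JVM behaviour can be made explicit with a small guard, sketched below. `AttachOnce` and `attachIfNeeded` are hypothetical names, and the `Runnable` stands in for whatever attach call is actually used.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard: runs the attach action at most once per JVM instance.
final class AttachOnce {
    private static final AtomicBoolean ATTACHED = new AtomicBoolean(false);

    // Returns true only for the call that actually performed the attach.
    static boolean attachIfNeeded(Runnable attachAction) {
        if (ATTACHED.compareAndSet(false, true)) {
            attachAction.run();
            return true;
        }
        return false;
    }
}
```

Calling `attachIfNeeded` at the top of the handler is then harmless on warm invocations, because the static flag survives for as long as the JVM instance is reused.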
OK. I will use api_request_time, give it some time before the method ends, and see how it goes.
Thanks for your great help in this matter. Very very much appreciated.
My guess is that if the APM reporting allowed manually flushing the messages to the server, that would solve my issue (similar to a logger's flush method). I might look at the agent's reporter source code and see if it is doable.
Thanks for the feedback! If that's a valid option (ie can be called without blocking the response) with acceptable overhead - we can certainly consider it (or something of that sort).
This may still be used for batching as well, for example- if you invoke your service synthetically once every 10 seconds with a marked request that tells the service to invoke blocking flushes.
I think we can't assume that the program is able to flush the report on its own thread in the Lambda world. Even with the 'api_request_time' approach, there is no 100% guarantee that the thread is able to stay alive to do that task.
My preferred approach is to be able to flush the reports manually (blocking) if needed. We can do whatever caching/batching we need as usual, but the flush method can just flush all the span/transaction reports in memory to the APM server at will.
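To make the idea concrete, here is a toy sketch of "batch as usual, but allow a blocking flush at will". It is not the agent's reporter: `BufferingReporter`, the string events, and the `sender` callback are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical reporter: batches events, and can also flush on demand.
final class BufferingReporter {
    private final List<String> buffer = new ArrayList<>();
    private final int batchSize;
    private final Consumer<List<String>> sender; // stand-in for the send to the APM server

    BufferingReporter(int batchSize, Consumer<List<String>> sender) {
        this.batchSize = batchSize;
        this.sender = sender;
    }

    // Normal path: events accumulate and are sent once a full batch is reached.
    synchronized void report(String event) {
        buffer.add(event);
        if (buffer.size() >= batchSize) {
            sendBuffered();
        }
    }

    // Manual path: send whatever is buffered now, on the calling thread.
    synchronized void blockingFlush() {
        sendBuffered();
    }

    private void sendBuffered() {
        if (buffer.isEmpty()) {
            return; // nothing to send, avoid an empty request
        }
        sender.accept(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

Because the flush runs on the caller's thread, it does not rely on a background thread staying alive after the handler returns, which is the concern with the timing-based approach.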
I certainly think the Lambda world will require some creativity and adjustments
That's what I tried to suggest. I am looking for a way to use the "manual flush" for batching. If you have a way to configure scheduled executions (maybe using CloudWatch events?) - that's great.
Otherwise, you will need to cache and flush periodically on a Lambda-executing thread. For that, I can (currently) see two options, both require some logic in the Lambda code:
The first is easier to implement, but it has no guarantee about flush timings. The second requires, in addition to changes in the service code, something that creates periodic requests. Luckily, we have Uptime .
And maybe- manually flushing after each execution will not be that bad even at 1k requests per second.
I will build a quick (and naive) implementation for a blocking manual flush and let's see how it works out.
Yup. Awaiting your glorious return (with the code change).
Hi again
Please download this API snapshot jar and this agent snapshot build and try them out.
I added a co.elastic.apm.api.ElasticApm#blockingFlush API.
Note that this is just a quick test build, not implemented in the way we eventually would want to implement that but should be good enough to get some feedback.
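One way to use a blocking flush from a handler is to put it in a `finally` block, so that it also runs when the handler throws. The sketch below keeps the flush as a plain `Runnable`, because the exact signature of the snapshot's `blockingFlush` is not shown here; `FlushingHandler` and its members are hypothetical names.

```java
import java.util.function.Function;

// Hypothetical wrapper: runs the real handler, then always flushes before returning.
final class FlushingHandler<I, O> {
    private final Function<I, O> delegate;
    private final Runnable flush; // e.g. a call into the snapshot's blocking flush (assumed)

    FlushingHandler(Function<I, O> delegate, Runnable flush) {
        this.delegate = delegate;
        this.flush = flush;
    }

    O handle(I input) {
        try {
            return delegate.apply(input);
        } finally {
            // Runs on both the success and the failure path.
            flush.run();
        }
    }
}
```

The traced transaction would need to be ended inside the delegate, before the flush, so that it is already in the buffer when the flush runs.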
Will be waiting for your feedback on that.
Thanks,
Eyal.
wow... you are super quick!...
I will give it a go tomorrow and see how it goes. and report it back. stay tuned
Hey!
Any updates? Any feedback on this would be very useful for us: basic tracing functionality, overhead, other functionality (e.g. how does the Metrics tab look, and does it make sense to you?).
Thanks,
Eyal.
Hey Eyal,
Thanks for keeping an eye on this. I have been caught up with production issues over the past few days and haven't had a chance to try the snapshot jar. But I will definitely do it over the weekend and let you know how it goes.
@f136989c Hey!
Did you get a chance to test this?
hey @Eyal_Koren i finally got a chance to test this using the snapshot. It seems to be working fine with the blocking flush.
what's the plan now? i am happy to use the snapshot for now.
Hi again Johnny!
Great, so please do for now. The only limitations are that you won't be able to upgrade until we implement that officially and once we do, you may need to adjust your code to use the new API.
The plan is to open a GitHub issue that will cover the "official" blocking flush implementation and API. However, I would like this issue to include documentation of what's needed in order to properly trace AWS Lambda. For that, if you can spend the time to provide the following info, that would be incredible:
The more you elaborate on those, the more you can make this useful for future users following your steps.
Thanks a lot for your great feedback!!
Eyal.