Why is the integration of the different agents so cumbersome?

Hi,

I played around with some of the APM agents, namely Java, Python, and RUM, and had a look at the docs of the others.

The Java agent is the easiest to integrate, although you have to know which packages the application uses. So far, so good, but there is room for improvement with regard to autodetection.

RUM is okay to integrate if you have an HTML file you can change, where you put the script tag. But integrating it into a Java application that you don't know is hard. Python works slightly better in this regard. I imagine it is nearly the same for all the other platforms.

Instrumenting Python is hard because the documentation is often unclear and keeps changing; at least, this is my impression.

I can compare with some APM competitors: they have autodetection and much easier instrumentation of methods in their Java agents. They also have the RUM agent integrated into their Java agents, which makes its use so much more accessible.

The integration of the Python agent works somehow, but auto-instrumentation should be extended considerably. Even Elastic's competitors, though, rely on at least some manual instrumentation.

The integration of the other agents is much worse compared to competitors.

All in all, why is the integration of Elastic's agents so cumbersome and hard? The competitors show that another world is possible.

Best regards,
Robert

I think you're raising some valid points here. A big area of focus for us going forward is to improve the getting started experience so your feedback is much appreciated.

Note that application_packages and even service_name are optional configuration options for the Java agent. We could do a better job of pointing that out in the documentation and examples.

Can you elaborate on what's particularly hard for Java applications? Is it including the RUM agent in the generated HTML pages of frameworks like JSF or Vaadin? Or are you talking about the integration of, for example, React frontend applications and Java backends?

Would it be helpful if the Java agent injected the RUM agent into the HTML pages it serves? Or do you prefer to manually add the RUM agent to your frontend application that has its own repo, so that you can use the RUM agent's API to customize spans and transactions?

What works better in Python?

Could you comment on what was particularly confusing or difficult?

Are you referring to missing auto-instrumentation for particular frameworks or libraries? If yes, which are missing for you?

Or do you mean that you'd prefer not to make any code changes at all in order to integrate the Python agent into your applications?

Thanks for your feedback,
Felix

Okay, perhaps the explanation has been extended since I first read it. It's quite comprehensive already.
But you should mention more clearly that trace_methods is necessary to see anything. I missed that before.

If you are not familiar with the application, it is not always clear where to put the file and its call. Plus, you presumably have to recompile the app; I didn't succeed in just changing the JSP file for one example app.

I would greatly prefer to have the Java agent and the other agents inject the RUM agent, ideally with the ability to configure it via the injecting agent using the Kibana APM app.
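To make the idea concrete, here is a minimal sketch of what "the backend agent injects the RUM agent" could mean: the served HTML is rewritten so that a script snippet lands just before the closing head tag. It is written in Python for brevity and is an illustration only, not how any Elastic agent works. The function name and the snippet contents are hypothetical.

```python
import re

# Matches the closing head tag, tolerating upper case and stray whitespace.
HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)


def inject_rum_snippet(html: str, snippet: str) -> str:
    """Return html with snippet inserted right before the first </head>.

    Responses without a closing head tag are returned unchanged, so
    non-HTML payloads are not corrupted.
    """
    match = HEAD_CLOSE.search(html)
    if match is None:
        return html
    return html[:match.start()] + snippet + html[match.start():]
```

A real implementation would also have to deal with streamed or compressed responses and with content security policies, which is part of why this is harder than it looks.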

The manual injection of the RUM agent works better in Python, as one can change the source code without recompiling. But I haven't tested it yet with frameworks that create their HTML dynamically.

I didn't understand how the auto-instrumentation is supposed to be used or how it works. I imagined that all code is instrumented somehow.

  • werkzeug
  • urllib[2,3]
  • some other low level modules

In general, it would be great not to have to change the source code. But in the case of Python, even the bigger players haven't achieved this.

Thank you for your interest.
Robert

By default, our agents just instrument well-known entry and exit points such as incoming HTTP requests and DB calls. If you want to get more method-level insight, check out the guide "How to find slow methods" in the APM Java Agent Reference [1.x].
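As a rough illustration of why only certain calls show up, here is a sketch of the general wrapping technique behind this kind of auto-instrumentation: a known function is replaced by a wrapper that times the call and records a span. This is not the Elastic agent's code. The names `instrument` and `recorded_spans` are hypothetical, and a real agent would report spans to an APM server instead of a list.

```python
import functools
import time

# Hypothetical in-memory store standing in for the agent's reporter.
recorded_spans = []


def instrument(module, func_name, span_type):
    """Replace module.func_name with a wrapper that records a timed span.

    Returns the original function so the patch can be undone.
    """
    original = getattr(module, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Recorded even if the wrapped call raises.
            recorded_spans.append({
                "name": f"{module.__name__}.{func_name}",
                "type": span_type,
                "duration_s": time.perf_counter() - start,
            })

    setattr(module, func_name, wrapper)
    return original
```

Calling `instrument(sqlite3, "connect", "db")` after `import sqlite3`, for example, makes every later `sqlite3.connect(...)` call leave a span, while functions that were never wrapped stay invisible. That is why application methods need extra configuration or manual instrumentation.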

Hi,

One addition with regards to .NET integration:
It is possible to use byte code injection for .NET; competitors have done this for ages. Why not Elastic? It is almost never feasible or even possible to manually instrument .NET applications.

Best regards,
Robert

It is possible to use byte code injection for .NET; competitors have done this for ages.

No, currently we don't do any byte code injection. Since .NET Core there are very good built-in ways to get APM data, e.g. DiagnosticSource, and we use that. This way the agent is just a simple library. We are working on adding the agent without code modification; this PR has more details on it. We may do it later for legacy Full Framework applications, and maybe we'll have some features based on byte code instrumentation. 10+ years ago, byte code instrumentation was indeed more or less the only way to do it, but today the landscape is different.

Also, byte code injection does not work everywhere: Mono has different features for it, and if you have short-lived processes it adds high overhead. Supporting every possible .NET flavor with byte code injection is also very hard. Another issue is potential errors caused by invalid byte code; it's not trivial to test it 100%. Plus, we are an open source agent, and community contribution is more likely with a pure C# code base that is a library. So by avoiding byte code injection we can ship faster.

It is almost never feasible or even possible to manually instrument .NET applications.

I feel this is a very opinionated statement - of course there can be cases when you can't instrument an app manually, but we have many many users doing this. Do you have any specifics here? I think the general statement that "almost never...possible to manually instrument .NET applications" is absolutely not true. Do you have any data or reference on this?

I used to use a competitor's product to instrument closed source .NET applications. Plus, customers are not willing or able to change their production applications.

Best regards,
Robert

Sometimes it is better to manually instrument code than to use auto instrumentation. Auto instrumentation is a bit like "one size fits all", or better, "one size fits most". Even though auto instrumentation is supported for the Java agent, we do not use it in all cases because manual instrumentation gives us more freedom...

That being said I do not want to say that the auto instrumentation isn't good - not at all! It is a great starting point where we can get a lot with a minimum of work - but both ways have their benefits and drawbacks.

Best regards
Wolfram

Thanks for that feedback, @Wolfram_Haussig. That's exactly how I'm thinking about it too.

Are you using manual instrumentation on top of the auto instrumentation or are you disabling auto instrumentation in some cases?

I have both use cases: in most cases I use the auto instrumentation and enhance that data with information from our code (like IDs or other interesting data).

In a single use case I have the problem that APM would generate a single transaction for a job while the job itself runs one or more jobs (it reads jobs from a database job table and executes jobs until none are left), so I have disabled the Quartz job instrumentation for this use case and generate the transactions for each job myself. All other use cases work fine with the Quartz logic.
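The pattern described here (one transaction per executed job instead of one for the whole scheduler run) can be sketched roughly as follows. The sketch is in Python for brevity, although the setup above is Java with Quartz. `transaction`, `transactions`, and `run_pending_jobs` are hypothetical names and not part of any agent API, and the job dictionaries are an assumed shape.

```python
import contextlib
import time

# Hypothetical stand-in for what an agent would report.
transactions = []


@contextlib.contextmanager
def transaction(name):
    """Record one transaction with its outcome and duration."""
    start = time.perf_counter()
    outcome = "success"
    try:
        yield
    except Exception:
        outcome = "failure"
        raise
    finally:
        transactions.append({
            "name": name,
            "outcome": outcome,
            "duration_s": time.perf_counter() - start,
        })


def run_pending_jobs(fetch_next_job):
    """Drain the job table, opening a separate transaction for each job."""
    while True:
        job = fetch_next_job()  # returns None once no jobs are left
        if job is None:
            break
        with transaction(f"job:{job['type']}"):
            job["run"]()
```

With this shape, a slow or failing job shows up as its own transaction instead of being hidden inside one long scheduler transaction.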
