Trying to understand APM Architecture -- why a dedicated APM Server process?

ninapavlich · January 31, 2018, 6:51pm

Hello! I've been setting up an Elastic Stack over the last few months to monitor infrastructure (using Metricbeat + Logstash on my clients, sending that data to Elasticsearch + Kibana).

I'm interested in expanding the monitoring capability to include APM on our Django applications and Python workers.

When looking at the API setup for Django, it seems like a custom APM middleware will send data to an APM Server, which in turn will process that data and store it in Elasticsearch. I'm wondering, what is the architectural reason for having this dedicated APM Server process? My team and I are trying to reduce the footprint of our monitoring infrastructure, and we are very wary of adding an extra process to the stack.

The way I understand Metricbeat and Logstash to work is that these applications run on the client machine and send data directly to Elasticsearch; then, using Kibana dashboards, I can view the data.

My question is, what is the added value that the dedicated APM Server brings? Why couldn't we just send the APM data directly from the client to Elasticsearch like we do for Logstash and Metricbeat? (I assume this was a well-thought-out decision and I'm just looking for the rationale so I can relay this to the team!)

roncohen · January 31, 2018, 7:38pm

Hi Nina,

Thank you for your interest in Elastic APM!

The APM Server is a separate process for a couple of different reasons:

In general, we try to keep agents as light as possible. Any heavy processing an agent needs to do will affect the performance of your app. We prefer to offload the heavy lifting to APM Server, which can be scaled independently.
With real-user-monitoring, data is being collected in browsers and we typically don't want browsers talking to Elasticsearch directly.
Having the agents talk to an APM Server makes it easier to centrally control the amount of data that flows into Elasticsearch, the pace at which it gets ingested etc.
APM Server will buffer up the data if Elasticsearch is momentarily down. Doing this buffering in agents would mean the memory footprint for your application ballooning. We can't rely on disk access being available to the agents.
Source mapping for JavaScript in the browser is best handled at a middle layer like APM Server.
The APM Server exposes a simple JSON API to agents which allows us to better maintain compatibility across different versions of agents and Elasticsearch.
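
To illustrate the buffering point above, here is a minimal sketch of a middle tier that holds events in a bounded in-memory buffer while the backend is unreachable and forwards them once it recovers. It is not how APM Server is actually implemented; `BufferingForwarder`, `send_batch`, `max_events` and `batch_size` are hypothetical names chosen for this example, and the backend is simulated by any callable that raises `ConnectionError` when it is down.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Event = Dict[str, object]


class BufferingForwarder:
    """Hypothetical middle tier: accepts events from agents and forwards them in batches."""

    def __init__(
        self,
        send_batch: Callable[[List[Event]], None],  # assumed to raise ConnectionError when the backend is down
        max_events: int = 1000,
        batch_size: int = 100,
    ) -> None:
        self._send_batch = send_batch
        # Bounded buffer: once full, the oldest event is dropped for each new one,
        # so memory use stays capped no matter how long the outage lasts.
        self._buffer: Deque[Event] = deque(maxlen=max_events)
        self._batch_size = batch_size

    def receive(self, event: Event) -> None:
        """Called by agents; cheap for the caller because nothing is sent here."""
        self._buffer.append(event)

    def flush(self) -> int:
        """Forward buffered events in order; return how many were delivered."""
        sent = 0
        while self._buffer:
            size = min(self._batch_size, len(self._buffer))
            batch = [self._buffer[i] for i in range(size)]
            try:
                self._send_batch(batch)
            except ConnectionError:
                # Backend is down: keep the events and try again on the next flush.
                break
            # Remove events only after the backend accepted the batch.
            for _ in range(size):
                self._buffer.popleft()
            sent += size
        return sent

    @property
    def pending(self) -> int:
        """Number of events still waiting to be forwarded."""
        return len(self._buffer)
```

In this sketch the agents only ever call `receive`, so an outage costs memory in the forwarder process rather than in the monitored application, and the cap on `max_events` is one central place to control how much data is held back.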

Like source mapping, we have more features on the drawing board that would likely fit poorly in the agents.

We also investigated the possibility of building much of the APM Server functionality as an Elasticsearch plugin. This came with the drawbacks of potentially inflating the Elasticsearch server process memory usage or the plugin consuming other resources that Elasticsearch itself needs. Additionally, being independent of the plugin API allows us to move faster.

Hope that helps!

wassim.dhib · February 14, 2018, 1:16pm

Hi,

I understand the need to have a separate process to handle heavy processing and deal with Elasticsearch.

But why not do the APM processing as a Logstash filter?

Something like: http-input / apm-filter / es-output

system · March 7, 2018, 9:16am

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.
