Can we use the APM UI without ingestion via OpenTracing or APM Server, and use logs instead?

Morgan_Brickley · May 31, 2019, 2:11pm

I know that the expected way to ship data to APM is to use an agent or the HTTP API, and that APM Server transforms the requests into Elasticsearch documents, the format of which is described here: https://www.elastic.co/guide/en/apm/server/6.1/generated-docs.html

We already ship JSON logs to Elasticsearch and would like to use the APM UI to visualize HTTP requests/responses, distributed transactions, and so on. However, we use a mix of Erlang and Python and would prefer not to use the (Python) agent or write our own HTTP emitters. Would it be possible to emit JSON logs in the transformed format, since these are already being shipped to our ES cluster? This would greatly ease the integration process and alleviate concerns about the potential impact of agents (ours or community ones) in production.

digitalron · May 31, 2019, 2:48pm

My recommendation is to send data to an APM Server and let it forward the data to Elasticsearch, so that you can take advantage of its ingest capabilities: some form of buffering, pipeline management, and security (Elasticsearch only has to talk to the APM Server, so there is no need to open it up to a wider network than necessary). You can opt not to use an agent and just send the minimum payload yourself, but that means:

- You take care of trace and transaction ID generation and propagation
- You implement some form of payload management in your Erlang and Python code
- You have to routinely look out for bug and security fixes and implement those yourself
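
To give a feel for the first point, here is a minimal sketch of ID generation and propagation in Python 3.6+. It assumes 128-bit trace IDs and 64-bit transaction/span IDs, hex-encoded, and it borrows the header layout from the W3C Trace Context `traceparent` format. The function names are made up for illustration, and the header name and format that your APM Server and agent versions actually expect may differ, so treat this as an outline rather than a drop-in implementation.

```python
import re
import secrets

# Layout borrowed from W3C Trace Context: version-traceid-parentid-flags
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_trace_id():
    """Return a random 128-bit trace ID as 32 lowercase hex characters."""
    return secrets.token_hex(16)


def new_span_id():
    """Return a random 64-bit transaction/span ID as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def build_traceparent(trace_id, parent_id, sampled=True):
    """Build the header value to attach to an outgoing request."""
    flags = "01" if sampled else "00"
    return "00-{}-{}-{}".format(trace_id, parent_id, flags)


def parse_traceparent(header):
    """Parse an incoming header value; return None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip().lower())
    if match is None:
        return None
    _version, trace_id, parent_id, flags = match.groups()
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }
```

A service that receives a request would call `parse_traceparent` on the incoming header, reuse the `trace_id`, generate its own ID with `new_span_id()`, and pass `build_traceparent(trace_id, own_id)` on every downstream call. The Erlang side would need an equivalent.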

felixbarny · May 31, 2019, 3:54pm

What are your concerns regarding the Elastic APM Python agent? How would logging traces to a file alleviate them?

Morgan_Brickley · June 4, 2019, 5:32pm

We use two custom frameworks based on Tornado, one on Python 2.7 and one on Python 3.6, so we would likely have to use either https://github.com/laerteallan/apm-agent-python-tornado or refer to the PR that adds Tornado support to the official Python agent library. However, bigger than the integration effort is the risk that tracing would impact production requests for some unknown reason. We have modified Tornado itself to support proxies with AQM, so it is not standard, and when adding a background HTTP request to the ioloop we would need to make sure that it is bounded in terms of performance and does not affect our custom backpressure / AQM code. In general the company is risk-averse, and we would need time to load test these apps to 'prove' that the cost of the agent can be controlled.

I see there is a failsafe to disable instrumentation, which is great. I think we'd also appreciate limits on data rates per instance of the agent (in addition to the sampling %).

Morgan_Brickley · June 4, 2019, 5:34pm

The logging approach I mentioned has its own issues, but it can be evaluated on paper before we do an integration, since we can calculate the logging load required.

axw · June 5, 2019, 4:56am

In the end the APM data is, as we say, "just another index". In theory, you could write a Filebeat or Logstash module to ingest your JSON logs, transform them into the APM format, and index them into Elasticsearch.
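
As a rough illustration of what such a transformation has to do, here is the mapping step written as a plain Python function. The input field names (`request_id`, `trace_id`, `method`, `route`, `status`, `start_epoch_s`, `duration_ms`) are hypothetical, and the output layout is only an approximation of a transaction document: the exact field names depend on your stack version and must be checked against the generated docs linked at the top of this thread.

```python
from datetime import datetime, timezone


def log_to_transaction_doc(log):
    """Map one (hypothetical) JSON request log record to an APM-style
    transaction document. Output field names are an assumption; verify
    them against the documented format for your version."""
    started = datetime.fromtimestamp(log["start_epoch_s"], tz=timezone.utc)
    return {
        "@timestamp": started.isoformat(),
        "processor": {"name": "transaction", "event": "transaction"},
        "trace": {"id": log["trace_id"]},
        "transaction": {
            "id": log["request_id"],
            "name": "{} {}".format(log["method"], log["route"]),
            "type": "request",
            # Group results by status class, e.g. 200 -> "HTTP 2xx"
            "result": "HTTP {}xx".format(log["status"] // 100),
            # Logs carry milliseconds here; the document uses microseconds
            "duration": {"us": int(round(log["duration_ms"] * 1000))},
            "sampled": True,
        },
    }
```

In a real pipeline the same mapping would live in a Logstash filter or an ingest step rather than in application code; the function above only shows which pieces of information your logs need to carry.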

If you did use apm-agent-python-tornado or similar, note that the Python agent has a configurable transport, which you could implement in such a way that it logs to disk rather than sending over HTTP. You would need to take care of buffering to ensure that logging doesn't then introduce bottlenecks in your application, which can quite easily happen in a hot code path. Also, I don't think the transport interface is part of the agent's stable API, so it could make upgrades a little more troublesome.
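
To sketch the buffering concern: the writer below keeps the hot path down to a non-blocking queue insert and does the disk I/O on a background thread, dropping events when the buffer is full instead of stalling the request. This is a standalone illustration, not the agent's transport interface; the class name and `max_pending` parameter are made up, and wiring it into an actual transport is left out.

```python
import json
import queue
import threading

_STOP = object()  # sentinel that tells the writer thread to exit


class BufferedJsonLineWriter:
    """Append events to a file as JSON lines without blocking the caller."""

    def __init__(self, path, max_pending=1000):
        self._path = path
        self._queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0  # approximate count of events discarded when full
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def emit(self, event):
        """Queue an event; return False (and drop it) if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _run(self):
        # All file I/O happens here, off the request path.
        with open(self._path, "a", encoding="utf-8") as fh:
            while True:
                event = self._queue.get()
                if event is _STOP:
                    break
                fh.write(json.dumps(event) + "\n")
                fh.flush()

    def close(self):
        """Write out everything already queued, then stop the writer thread."""
        self._queue.put(_STOP)
        self._thread.join()
```

Dropping on overflow is a deliberate choice here: it bounds both memory and latency, at the cost of losing trace data under load. The `dropped` counter at least makes that loss visible.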

I don't know of anyone doing this, so I can't provide any references unfortunately.

> I see there is a failsafe to disable instrumentation, which is great. I think we'd also appreciate limits on data rates per instance of the agent (in addition to the sampling %).

There are a couple of things we've been investigating that I think will help here:

- agent-side aggregation for non-sampled transactions
- rate-limit configuration for sampling, to enable you to specify sampling in terms of transactions per second

Together these would mean the amount of data sent would be proportional to your sampling rate, rather than the transaction rate. Is this along the lines of what you had in mind?
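
To make the rate-limit idea concrete, one way it could work is a token bucket that allows at most N sampled transactions per second regardless of how many requests arrive. The sketch below is only an illustration of that idea, not something the agent provides today; the class and parameter names are hypothetical.

```python
import time


class RateLimitedSampler:
    """Token bucket: sample at most `max_per_second` transactions per second."""

    def __init__(self, max_per_second, clock=time.monotonic):
        self._rate = float(max_per_second)
        self._clock = clock  # injectable so the behaviour can be tested
        self._tokens = self._rate  # start with a full bucket
        self._last = clock()

    def should_sample(self):
        now = self._clock()
        # Refill in proportion to elapsed time, capped at one second's worth.
        elapsed = now - self._last
        self._tokens = min(self._rate, self._tokens + elapsed * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

Each request would call `should_sample()` once at the start of the transaction; anything that returns `False` would only contribute to the aggregated metrics.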

system · June 26, 2019, 1:04am

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.
