Is it possible to redact parts of the captured body in NodeJS APM?

pocketcolin · June 12, 2023, 6:41pm

I'm using the NodeJS APM agent for capturing activity on my server. I currently have the APM init configured with captureBody: 'errors' which is great for debugging, but it doesn't seem to redact anything which means that when an error is thrown due to an invalid password, the password gets logged to our ELK instance. That's definitely not ideal. Is there a way to add a middleware or hook into the body capturing functionality to redact parts of the body or not include any body depending on the URL?

Elastic Cloud
Kibana version: 8.8.1
Elasticsearch version: 8.8.1
APM Agent language and version: NodeJS 3.46.0

trentm · June 12, 2023, 11:18pm

Hi @pocketcolin,

It is not obvious at all, but sanitization/redaction of fields in a captured incoming HTTP request body is done only for form data -- i.e. requests with Content-Type: application/x-www-form-urlencoded. This is mentioned at "Configuration options | APM Node.js Agent Reference [3.x] | Elastic". I'm not sure if it is discussed elsewhere in the docs. I've opened issue #3426 in elastic/apm-agent-nodejs ("`captureBody` docs don't mention the conditions for sanitization/redaction") to add a mention of this to the captureBody docs.

Take this example. (I'm using apm.addTransactionFilter() to conveniently dump the transaction data before it is sent on to the APM server. This is also foreshadowing.)

// capturebody.example.js
const apm = require('elastic-apm-node').start({
  // serverUrl: '...',
  // secretToken: '...',
  serviceName: 'capturebody-example',
  apiRequestTime: '2s',
  metricsInterval: '0s',
  captureBody: 'all'
})

apm.addTransactionFilter(trans => {
  console.log('about to send this transaction: ', trans)
  return trans
})

const bodyParser = require('body-parser')
const express = require('express')

const app = express()
app.use(bodyParser.json())
app.use(bodyParser.urlencoded({extended: false}))
app.post('/ping', function (req, reply) {
  console.log(`received request: ${req.method} ${req.url}, content-type:${req.headers['content-type']}`)
  reply.send({ping: 'pong'})
})
app.listen({ port: 3000 }, async () => {
  console.log('listening at <http://127.0.0.1:3000/ping>')
})

If we run that and then POST to it with form data:

% curl -v http://127.0.0.1:3000/ping -X POST -d foo=bar -d passwd=secret
...
> POST /ping HTTP/1.1
> Content-Type: application/x-www-form-urlencoded
...

Then the transaction data shows that the "passwd" field -- whose name matches one of the default patterns in "Configuration options | APM Node.js Agent Reference [3.x] | Elastic" -- is redacted:

% node capturebody.example.js
...
received request: POST /ping, content-type:application/x-www-form-urlencoded
about to send this transaction:  {
  id: 'a26c10c366d34a8e',
  trace_id: '841e9449f6dd84158bd459baaeb6c3bf',
  parent_id: undefined,
  name: 'POST /ping',
  type: 'request',
  duration: 3.551,
  timestamp: 1686610832961007,
  result: 'HTTP 2xx',
  sampled: true,
  context: {
    user: {},
    tags: {},
    custom: {},
    service: {},
    cloud: {},
    message: {},
    request: {
      http_version: '1.1',
      method: 'POST',
      url: [Object],
      headers: [Object],
      socket: [Object],
      body: '{"foo":"bar","passwd":"[REDACTED]"}'
    },
    response: { status_code: 200, headers: [Object] }
  },
  span_count: { started: 0 },
  outcome: 'success',
  faas: undefined,
  sample_rate: 1
}

However, if we POST with a JSON content-type:

% curl http://127.0.0.1:3000/ping -X POST -H content-type:application/json -d '{"foo":"bar","passwd":"secret"}'

Then there is no redaction:

received request: POST /ping, content-type:application/json
about to send this transaction:  {
  id: 'dabeae3b13849dd7',
  trace_id: '764730e22d8a60f4428aeba274681f59',
  parent_id: undefined,
  name: 'POST /ping',
  type: 'request',
  duration: 14.57,
  timestamp: 1686610819372043,
  result: 'HTTP 2xx',
  sampled: true,
  context: {
    user: {},
    tags: {},
    custom: {},
    service: {},
    cloud: {},
    message: {},
    request: {
      http_version: '1.1',
      method: 'POST',
      url: [Object],
      headers: [Object],
      socket: [Object],
      body: '{"foo":"bar","passwd":"secret"}'
    },
    response: { status_code: 200, headers: [Object] }
  },
  span_count: { started: 0 },
  outcome: 'success',
  faas: undefined,
  sample_rate: 1
}

As I hinted above, you could use apm.addTransactionFilter(fn) to filter as you require for your application. Something like this:

apm.addTransactionFilter(trans => {
  if (trans?.context?.request?.body) {
    try {
      const body = JSON.parse(trans.context.request.body)
      if ('passwd' in body) {
        body.passwd = '[REDACTED]'
      }
      trans.context.request.body = JSON.stringify(body)
    } catch (_err) {
      // pass
    }
  }
  // console.log('about to send this transaction: ', trans)
  return trans
})
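If you need more than a single top-level field, the same idea can be generalized. The following is only a sketch under a few assumptions: the key patterns and the /login path are hypothetical examples (they are not the agent's built-in sanitizeFieldNames list), and it assumes the event carries the request path at context.request.url.pathname. It redacts matching keys at any nesting depth and drops the body entirely for selected paths.

```javascript
// Hypothetical key patterns for illustration; NOT the agent's built-in list.
const SENSITIVE_KEYS = [/passw(or)?d/i, /secret/i, /token/i]
// Hypothetical paths whose request bodies should never be sent.
const SKIP_BODY_PATHS = ['/login']

// Recursively replace the value of any key that matches a sensitive pattern.
function redactValue (value) {
  if (Array.isArray(value)) return value.map(redactValue)
  if (value !== null && typeof value === 'object') {
    const out = {}
    for (const [key, val] of Object.entries(value)) {
      out[key] = SENSITIVE_KEYS.some(re => re.test(key))
        ? '[REDACTED]'
        : redactValue(val)
    }
    return out
  }
  return value
}

// Filter for an APM event that carries context.request.
// Assumption: the request path is available at context.request.url.pathname.
function redactRequestBody (event) {
  const request = event && event.context && event.context.request
  if (!request || typeof request.body !== 'string') return event
  const pathname = request.url && request.url.pathname
  if (SKIP_BODY_PATHS.includes(pathname)) {
    delete request.body
    return event
  }
  try {
    request.body = JSON.stringify(redactValue(JSON.parse(request.body)))
  } catch (_err) {
    // Body is not JSON (for example the '[REDACTED]' placeholder): leave it as-is.
  }
  return event
}

// Registration with a started agent would look like:
// apm.addTransactionFilter(redactRequestBody)
```

The filter always returns the event, so nothing is dropped from what is sent to the APM server; only the body field is changed or removed.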

pocketcolin · June 13, 2023, 2:49pm

Thanks, Trent! That is exactly what I was looking for. It's also interesting to see that request.body is a stringified JSON object, because that transaction param is unfortunately just typed as { [propName: string]: any }.

trentm · June 13, 2023, 4:00pm

Yah, we don't have a strong TypeScript typing of the payload.

FWIW, if it helps, we have JSON Schema definitions for those payloads once they are serialized to JSON. The entry for the request body for transactions is here: apm-agent-nodejs/test/integration/api-schema/apm-server-schema/transaction.json at main · elastic/apm-agent-nodejs · GitHub

pocketcolin · June 13, 2023, 8:09pm

@trentm question - does captureBody not capture 4xx level error bodies? I'm seeing redacted bodies on my error transactions for some 400 errors.

trentm · June 13, 2023, 8:31pm

@pocketcolin I think this'll be confusion on whether this is a captured request body on a "transaction" APM event or on an "error" APM event. You had configured the APM agent with captureBody: 'errors' so you should only be seeing a captured request body for an "error" APM event.

On "transaction" APM events, you should expect to see transaction.context.request.body = '[REDACTED]'. (I find this is somewhat confusing. It isn't so much that the body was captured, and then redacted because it was sensitive data. Rather the body was just not captured. I think having that field just not set in the captured data would be clearer for this case.)
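Since the captured body lands on "error" events with captureBody: 'errors', any body filter has to be registered for errors as well as transactions. A rough sketch, not something the agent does for you: registerBodyFilter and dropPlaceholderBody are hypothetical helper names, while apm.addTransactionFilter() and apm.addErrorFilter() are the agent's own registration functions. The filter here simply removes the placeholder body so that "not captured" does not look like "captured, then redacted".

```javascript
const PLACEHOLDER = '[REDACTED]'

// Remove the body field when it only holds the placeholder string.
// Caveat: a genuinely captured body that is exactly '[REDACTED]' is removed too.
function dropPlaceholderBody (event) {
  const request = event && event.context && event.context.request
  if (request && request.body === PLACEHOLDER) {
    delete request.body
  }
  return event
}

// Register one filter function for both transaction and error events.
// `apm` is expected to be the started elastic-apm-node agent.
function registerBodyFilter (apm, filter) {
  apm.addTransactionFilter(filter)
  apm.addErrorFilter(filter)
}

// Usage with a started agent:
// registerBodyFilter(apm, dropPlaceholderBody)
```

Any other body filter, such as a JSON redaction function, can be passed to registerBodyFilter in the same way.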

So, for your case, I think the question is: Why don't you see "error" APM events for incoming HTTP requests that result in a 4xx response statusCode? Ultimately this will depend on the instrumentation for the web framework you are using, but in general a 4xx statusCode is not considered an error from the server side, because it is a client error. At least that is the decision in our APM agent specs. From https://github.com/elastic/apm/blob/main/specs/agents/tracing-transactions.md#transaction-outcome

"failure": Indicates that this transaction describes a failed result.
Note that client errors (such as HTTP 4xx) don't fall into this category as they are not an error from the perspective of the server.
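To make that rule concrete, here is a simplified paraphrase of the server-side mapping as I read the spec. It is a sketch, not the agent's actual code: serverOutcomeForStatus is a hypothetical name, and treating a missing status code as "unknown" is an assumption of this example.

```javascript
// Map an HTTP response status code to a transaction outcome from the
// server's perspective: only 5xx counts as a failure, 4xx does not.
function serverOutcomeForStatus (statusCode) {
  if (!Number.isInteger(statusCode)) {
    // Assumption for this sketch: no usable status code means "unknown".
    return 'unknown'
  }
  return statusCode >= 500 ? 'failure' : 'success'
}

console.log(serverOutcomeForStatus(400)) // success
console.log(serverOutcomeForStatus(503)) // failure
```

So a request rejected with a 400 for an invalid password still produces a "success" transaction, and no "error" event unless the application or framework instrumentation reports one.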

system · July 11, 2023, 8:31pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
