Elastic-apm-node memory leak

Ever since I incorporated APM (elastic-apm-node) in some nodejs microservices they have developed memory leaks.

I have a small repo that shows a couple strange things that might be related...

You will need a MS Sql Server db connection string and an APM server token

In the trace you will notice that even though some spans are sent, not all are. Many times I only get one.

If you set count to 10 or higher you will also see one or more "MaxListenersExceededWarning: Possible EventEmitter memory leak detected." messages.

If you set the count to 4000 you will (hopefully) get:
"MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 infoMessage listeners added."
"MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 errorMessage listeners added."
and then ...
"FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory"

Kibana version: 6.7.1

Elasticsearch version: 6.7.1

APM Server version: 6.6.1

APM Agent language and version: elastic-apm-node 2.10.0

Steps to reproduce:

    First Case: not all spans sent and memory leak warning
  1. clone https://github.com/SamMaxwell/elastic-apm-test.git

  2. cd into the new folder

  3. npm install

  4. set ELASTIC_APM_SERVER_URL, ELASTIC_APM_SECRET_TOKEN, ELASTIC_APM_LOG_LEVEL env variables

  5. node . "some MS Sql Server db connection string" "some query" 10

  6. inspect the trace output

    (you should notice that not all spans are sent and you get a memory leak warning)

    Second Case: JavaScript heap out of memory
    
  7. delete the ELASTIC_APM_LOG_LEVEL env var

  8. node . "some MS Sql Server db connection string" "some query" 4000

  9. look at the FATAL ERROR

    Third Case: disable APM and everything works fine
    
  10. set NO_APM env var to anything (to disable APM)

  11. node . "some MS Sql Server db connection string" "some query" 4000

  12. notice how it works with no FATAL ERROR

Thanks for the report and sorry you're having issues.

I've cloned your repo and tried running it locally--I do get the emitter count warning, but no fatal error, even bumping it up all the way to 1,000,000. It's likely just caused by the closure leak caused by the event handlers though. It's due to a mismatch between adding an event handler and trying to remove it due to the function wrap in https://github.com/elastic/apm-agent-nodejs/blob/master/lib/instrumentation/index.js#L284-L299

Thanks for looking into this.

Maybe getting the FATAL error somehow involves the nature/structure of data returned from the query. I was returning between 1 and 100 rows from a table with more than 50 columns in my testing.

Are you suggesting I do something to the application logic?

It's possible, but no need to do anything right now. I've got an idea for a fix for the emitter leak in mind. I'll try to get that out hopefully later today. After that, give it a try and see if that fixes the fatal error too--I suspect it does, but we'll see.

I've got a pull request here, just running through the CI now. I'll let you know when I land and release that. https://github.com/elastic/apm-agent-nodejs/pull/1044

thanks!

I've got the fix out in 2.11.0. Give that a try and let me know how it goes. :slight_smile:

1 Like

It seems like 2.11.0 fixes the MaxListeners warning and the FATAL out of memory!

I still see that the SPANS are not being automatically created except for the first iteration.

  • set ELASTIC_APM_LOG_LEVEL to "Trace"
  • run the test for 10 iterations
  • look at the trace output
  • notice how the "start span"/"ended span"/"encoding span" appears on the first iteration, but not the 2nd through 10th iterations
  • if you look at the apm docs with kibana you will notice that the spans do not appear

It's a context loss issue. We use async_hooks to maintain context through an async execution tree, but it appears to not support thenables, which is an object that looks like a promise by having a "then" method, but is not actually a promise. Tedious seems to use thenables internally. After awaiting a thenable async_hooks loses track of what the async ids are. A way around it is to wrap it in a real promise before awaiting, like this:

await new Promise((...args) => {
  promiseLike.then(...args)
})

I opened a second topic Not all transactions/spans are sent from agent / stored in elastic and worse when using addLabels when I thought it might make more sense to do so.

I use knex in the application, not tedious. tedious is used by mssql, which is the client specified in the config sent to knex to get the query-builder.

what are you suggesting I wrap with a Promise?

Oh, sorry--I meant knex, not tedious. I'm suggesting you wrap your queries that you want to await in a real promise so async_hooks can propagate the context properly.

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.