Spans are sent after custom transaction is closed (Node.js APM agent)

Hi, I'm trying to use APM in a small module and running into the following problem - I start and close a transaction manually, but spans (both custom and from instrumented MongoDB query) are sent after the transaction is closed.

Here is code sample:

// some imports 

const transaction = apmWrapper.agent.startTransaction('test-apm-2', 'job');
const span = transaction.startSpan('my-span', 'task');

MongoDb
  .connect()
  .then(() => UserCollection.find({}).limit(1).toArray())
  .then(([user]) => logger.warn(user.nick))
  .then(() => {
    span.end();
    transaction.end('done');
    return apmWrapper.flush(); // wraps callback into a promise
  })
  .then(() => ShutdownHelper.exit(0))
  .catch(e => {
    logger.error('Rating epic fail', e);
    ShutdownHelper.exit(1);
  });

So I'd expect all spans to be closed and sent before the transaction is closed, as soon as they doesn't return any promise to wait on, so I presume they work synchronously.

But here is what I got in the debug log:

Debug log
adding hook to Node.js module loader
shimming mongodb-core@3.1.9 module
shimming mongodb-core.Server.prototype.command
shimming mongodb-core.Server.prototype functions: [ 'insert', 'update', 'remove', 'auth' ]
shimming mongodb-core.Cursor.prototype functions: [ '_find', '_getmore' ]
shimming bluebird@3.5.1 module
shimming bluebird.prototype functions: [ '_then', '_addCallbacks' ]
shimming bluebird.config
start trace { trans: '9f24bea44d26b2f5',
  parent: undefined,
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'test-apm-2',
  type: 'job' }
start span { span: '7fefd10fc85d38e2',
  parent: '9f24bea44d26b2f5',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'my-span',
  type: 'task' }
intercepted call to mongodb-core.Cursor.prototype._find { id: '9f24bea44d26b2f5' }
start span { span: '27125da8ceb6754d',
  parent: '7fefd10fc85d38e2',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'speke-local-1.users.find',
  type: 'db.mongodb.query' }
intercepted mongodb-core.Cursor.prototype._find callback { id: '9f24bea44d26b2f5' }
ended span { span: '27125da8ceb6754d',
  parent: '7fefd10fc85d38e2',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'speke-local-1.users.find',
  type: 'db.mongodb.query' }
encoding span { span: '27125da8ceb6754d',
  parent: '7fefd10fc85d38e2',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'speke-local-1.users.find',
  type: 'db.mongodb.query' }
warn: SomeUserNick
ended span { span: '7fefd10fc85d38e2',
  parent: '9f24bea44d26b2f5',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'my-span',
  type: 'task' }
encoding span { span: '7fefd10fc85d38e2',
  parent: '9f24bea44d26b2f5',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'my-span',
  type: 'task' }
setting transaction result { trans: '9f24bea44d26b2f5',
  parent: undefined,
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  result: 'done' }
sending transaction { trans: '9f24bea44d26b2f5',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6' }
ended transaction { trans: '9f24bea44d26b2f5',
  parent: undefined,
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  type: 'job',
  result: 'done',
  name: 'test-apm-2' }
sending span { span: '7fefd10fc85d38e2',
  parent: '9f24bea44d26b2f5',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'my-span',
  type: 'task' }
sending span { span: '27125da8ceb6754d',
  parent: '7fefd10fc85d38e2',
  trace: '731b9ee10e9b6e2cb33983ce6c486cb6',
  name: 'speke-local-1.users.find',
  type: 'db.mongodb.query' }

As you can see at the end of the log, spans are sent after the transaction is closed, so I see an empty transaction in the UI, without any spans.

And if I just add a small 10ms timeout before ending the transaction, it all works perfectly fine.

.then(() => new Promise(resolve => setTimeout(resolve, 10)))
.then(() => {
  transaction.end('done');
  return apmWrapper.flush();
})

But it does seem like a wrong way.
Could anyone please explain why this is happening and maybe advise a correct way of achieving my goal?

1 Like

Hi,

Thanks for trying out Elastic APM and the Node.js agent. And thanks for the detailed error report.

After ending a span there's a little bit of asynchronous work (processing stack traces and source lines) that needs to be completed before it can be sent. So it's expected that the span might be sent after the transaction in your case.

Elastic APM doesn't care however in which order the transactions and spans arrive at and is stored in Elasticsearch. So it shouldn't affect the "span waterfall" when viewing the transaction in Kibana.

But you say that you see an empty transaction in the UI, without any spans, which does sound odd. Have you tried to query for the missing spans in Discover?

/thomas

1 Like

Hey, thanks for the fast reply :slight_smile:

I checked and wasn't able to find missing spans via Discover, and I guess I know why.
Point is, the script stops its work right after the flush.
ShutdownHelper.exit(0) is actually a wrapper for process.exit(0), I should've clarified that earlier.

So, when flush happens, Mongo-related span is not ready yet. And then the program stops, so the span doesn't have a chance to be sent.

So when I add a small timeout before the flush() call - it works.
But timeouts still seem like a pretty bad practice :slight_smile:

Is there, maybe, a way to know when all spans are ready to be sent?

Ah yes, that would explain it.

If you're going to exit the process and want to make sure that all data have been sent to the APM Server, you can manually call agent.flush().

If you give this function a callback as the first argument, it will call the callback once all data have been sent to the APM Server.

But I do call flush() :slight_smile:

This line here calls it, it just wrapped in a promise:

return apmWrapper.flush(); // wraps callback into a promise

And it sends all available at the moment data to the server, but some spans are just not ready yet.
Problem is, I don't know if all spans finished their work, so it's like:

- Transaction starts
- Spans start
- ...
- Some spans end, some doesn't <-- I can see in UI those which ended at this point
- Transaction ends
- Flush gets called
- Remaining spans end <-- and those are not sent to server at all
- Process exits

So as soon as I can't tell if all spans are ready, I can't know when to call flush() to guarantee all spans are sent to the server..

Or do I miss something?

Oh, sorry I missed that.

The flush function doesn't return a promise, so you can't just return the result of calling it at the end of a then block. This will not magically wrap it in a promise. You'd have to promisify it first.

That being said, you are completely right. If an ended span is still in the process of being encoded, the flush function will not know about it and it will call its callback before the span gets a chance to be sent to the APM Server.

I can see this is a bit confusing as the async nature of ending a span isn't apparent. What is your use-case for wanting to exit the process right after ending a span?

To be crystal clear - I do promisify the flush call :slight_smile:

apmWrapper.flush() stands for:

  /**
   * @return {Promise}
   */
  flush() {
    return new Promise((resolve, reject) =>
      this.agent.flush(err => err ? reject(err) : resolve()));
  }

As for my use-case - I have a relatively small script (as one of the components of my system), which is run periodically by cron. So once it's done its work, it exits (until the next cron run).

I'm trying to gather some metrics from it via Elastic APM to have it all in one place.

Ah, I see. I haven't had my coffee this morning :smile:

I've created an issue on GitHub so this experience can be improved: https://github.com/elastic/apm-agent-nodejs/issues/798

2 Likes

Thanks a lot!
I'll keep track on that issue :slight_smile:

Meanwhile, I've got a couple more questions about span waterfall and distributed tracing, but I guess those are matters for another topics :slight_smile:

This topic was automatically closed 20 days after the last reply. New replies are no longer allowed.