Can APM run its queue processing separately from my own service's process?

A few days ago, my ES server's disk usage hit the reallocation threshold. After that, my service couldn't connect to APM (the log shows "503 queue is full") and my service returned 504 timeouts.
I have some questions about this:

  1. I read in some documentation that both the APM agent (to buffer requests to the APM server) and the APM server (to buffer requests to the ES server) have an internal queue. Is that right?

  2. What is the default queue type?

  3. Is there a setting that separates the queue processing from my service's own process, so that my service keeps working and doesn't go down when the APM/ES server is down?
    Or is there a setting for a timeout, so that if the APM/ES server doesn't respond, my service carries on processing by itself?
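
To make question 3 concrete, here is a rough Python sketch of the timeout behavior I'm hoping for. It is only an illustration: `report_with_timeout` and the `send` callable are hypothetical names I made up, not part of any APM agent API, and I'm not claiming the agent works this way.

```python
import concurrent.futures
import time

# Hypothetical single worker that performs the (possibly slow) send.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)


def report_with_timeout(send, event, timeout_seconds):
    """Try to send an event, but give up waiting after timeout_seconds.

    Returns True if the send finished in time, False if it timed out or failed.
    The caller carries on with its own work either way.
    """
    future = _executor.submit(send, event)
    try:
        future.result(timeout=timeout_seconds)
        return True
    except concurrent.futures.TimeoutError:
        # The backend did not answer in time; stop waiting for it.
        return False
    except Exception:
        # The send itself failed; monitoring errors should not reach the caller.
        return False
```

With this, a request handler would wait at most `timeout_seconds` for monitoring and then return its normal response. A real implementation would also need to bound how many pending sends can pile up behind a stuck backend; this sketch does not do that.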

Thank you very much!

Both the APM Server and the APM agents have internal queues that buffer events when they are unable to send them.

The apm-server uses a bounded FIFO queue. The agents implement their queues differently; information can be found here: Common problems | APM User Guide [8.1] | Elastic

If the agents are unable to communicate with the apm-server, this should not interrupt your main service's process.
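
As a rough illustration of why a full buffer does not have to stall the service, here is a minimal Python sketch of a bounded FIFO buffer that drops events instead of blocking. This is not the actual agent or apm-server code; `BoundedEventBuffer`, `offer`, `drain_to`, and `handle_request` are hypothetical names used only for this example.

```python
import queue
import threading


class BoundedEventBuffer:
    """Hypothetical bounded FIFO buffer that never blocks the producer."""

    def __init__(self, max_size):
        self._queue = queue.Queue(maxsize=max_size)
        self.dropped = 0  # rough counter, for illustration only

    def offer(self, event):
        """Enqueue without blocking; drop the event if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain_to(self, send, stop):
        """Background loop: deliver events in FIFO order until stop is set."""
        while not stop.is_set():
            try:
                event = self._queue.get(timeout=0.1)
            except queue.Empty:
                continue
            try:
                send(event)
            except Exception:
                pass  # a failing backend must not affect the service


def handle_request(buffer, request_id):
    """Hypothetical request handler: record an event, then respond as usual."""
    buffer.offer({"request": request_id})
    return 200
```

In a setup like this, `drain_to` would run in a separate `threading.Thread` with a `threading.Event` as `stop`, so the request path only ever calls the non-blocking `offer`. If the backend is down, events are dropped once the buffer fills, but `handle_request` still returns immediately.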

Thanks Nelson,

But I'm really seeing my main service go down (actually a 504 timeout) once the agent's queue is full... Is there any reason that would trigger this?

These are my reproduction steps:

  1. systemctl stop es.service (to simulate the ES disk hitting 95%)

  2. As my error sample rate is 100%, I repeatedly send requests that trigger errors to my service to fill the queue

  3. After the queue is full, my service keeps returning 504

  4. As my normal sample rate is 20%, a request that doesn't send data to APM is processed normally. But if it is captured by APM, it also returns 504

By the way, about the queue type: is it a blocking queue? And can I get the queue size from the agent and clear the queue at runtime?
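
To be clear about what I mean by "blocking", here is a small Python sketch using the standard library `queue.Queue`, not the agent's real queue. The helpers `timed_put` and `clear` are names I made up for illustration; I don't know whether the agent exposes anything similar.

```python
import queue
import time


def timed_put(q, item, block, timeout=None):
    """Try to enqueue item; return (accepted, seconds spent waiting)."""
    start = time.monotonic()
    try:
        q.put(item, block=block, timeout=timeout)
        accepted = True
    except queue.Full:
        accepted = False
    return accepted, time.monotonic() - start


def clear(q):
    """Remove everything currently in the queue; return how many items were removed."""
    removed = 0
    while True:
        try:
            q.get_nowait()
        except queue.Empty:
            return removed
        removed += 1
```

With a full queue, `timed_put(q, item, block=False)` fails right away, while `timed_put(q, item, block=True, timeout=0.2)` makes the caller wait for the whole timeout before failing. That second behavior is what I suspect is happening to my requests. `q.qsize()` and `clear(q)` are the kind of runtime inspection and reset I'm asking about.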
