APM Server/Agent Deployment Questions

I just started a new job, and have been tasked with researching deploying APM on an existing ELK stack. I have not worked with this technology before, so I'm watching a lot of videos and reading a lot of documentation trying to get up to speed.
I have tried to find documentation on system requirements, but I haven't found a thorough answer for enterprise deployments. Hopefully someone can point me in the right direction.
Our current environment consists of the following (All hosted on AWS):
3 master Elasticsearch servers that handle the API, queries, and cluster management
3 data Elasticsearch servers
2 Kibana servers with Logstash and Beats

So based on this environment, how many APM servers would be suggested? What about sizing/cpu/memory?

I'm sure I am going to have more questions. I'm just trying to find somewhere to start.

Hello @Alyson_Whitaker, glad to hear you're giving APM a spin. APM Server sizing is generally based on the kinds of applications you're instrumenting. This guide should help get you to an initial deployment. We recommend you enable monitoring to keep an eye on APM Server performance.

Good luck and keep the questions coming!

Thanks @gil for your response. Do you have any documentation that can give me insight into how many APM servers I should have in my stack? I'm not sure whether I should match the number of APM servers to the number of Elasticsearch servers we currently have.

Hi @Alyson_Whitaker,

As @gil mentioned, APM server sizing depends on the kind and number of applications that need to be instrumented.

Review how many applications need to be instrumented and how much load you expect from each server. I would start by setting up one medium-sized APM Server, instrumenting one of your applications, and monitoring the following:

  • Memory
  • CPU
  • Load
  • Failed Rate
  • Response Errors

Once you have these numbers, you will have a better idea of how much load you are dealing with, and you can then determine the APM Server size and the number of instances.

This guide describes the performance of various instance types (512 MB vs. 2 GB vs. 8 GB). Also, see this note:

Don’t forget that the APM Server is stateless. Several instances running do not need to know about each other. This means that with a properly sized Elasticsearch instance, APM Server scales out linearly.
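Because the scale-out is described as linear, a rough instance count can be estimated from the throughput you measure on your single test instance. The following is only a back-of-the-envelope sketch: the function name, the headroom factor, and the example numbers are all hypothetical, and it assumes Elasticsearch is not the bottleneck.

```python
import math


def instances_needed(expected_events_per_sec, measured_events_per_sec_per_instance, headroom=0.3):
    """Estimate how many APM Server instances to run.

    Assumes linear scale-out (stateless instances) and that Elasticsearch
    can absorb the combined write load. `headroom` is a hypothetical safety
    margin: the fraction of each instance's measured capacity left unused.
    """
    if measured_events_per_sec_per_instance <= 0:
        raise ValueError("measured throughput must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")

    # Capacity we are willing to rely on per instance.
    usable = measured_events_per_sec_per_instance * (1 - headroom)
    # Always run at least one instance.
    return max(1, math.ceil(expected_events_per_sec / usable))


# Made-up numbers: one test instance sustained 1200 events/s,
# and the full fleet of applications is expected to send 5000 events/s.
print(instances_needed(5000, 1200, headroom=0.25))
```

Replace the example numbers with what you actually observe while monitoring the first instrumented application.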

Some of the important configuration options to consider are:

  • queue.mem.events
  • queue.mem.flush.min_events
  • output.elasticsearch.workers
  • output.elasticsearch.bulk_max_size
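These four options interact, so it can help to derive them together rather than tune each in isolation. The sketch below assumes a common rule of thumb (queue size roughly equal to workers times bulk size, with the flush threshold no larger than one bulk request). That rule, the helper name, and the example values are assumptions for illustration, not official recommendations, so check them against the tuning guide for your version.

```python
def suggest_queue_settings(workers, bulk_max_size):
    """Derive a starting point for the four options listed above.

    Assumption (rule of thumb, not an official formula): the in-memory
    queue should hold about one full bulk request per output worker.
    """
    if workers < 1 or bulk_max_size < 1:
        raise ValueError("workers and bulk_max_size must be >= 1")

    queue_events = workers * bulk_max_size
    return {
        "output.elasticsearch.workers": workers,
        "output.elasticsearch.bulk_max_size": bulk_max_size,
        "queue.mem.events": queue_events,
        # Assumption: flush once a full bulk request is available,
        # which never exceeds the queue size computed above.
        "queue.mem.flush.min_events": bulk_max_size,
    }


# Hypothetical example values; print them as dotted keys for the config file.
for key, value in suggest_queue_settings(workers=4, bulk_max_size=1024).items():
    print(f"{key}: {value}")
```

Treat the output as a first guess, then adjust while watching memory and the failure and error rates from the list above.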

You can review some of the tuning options here.

While monitoring the APM server, you can review some common problems that you may encounter here.

Good luck!