Advice on Metrics


I am new to the Metricbeat concept and am exploring options.

I would like to collect metrics from all hardware and applications (e.g. HTTP servers, databases) in the enterprise, where there are hundreds of these.

What would be the best approach:

  1. Use Metricbeat to call each asset's IP address and capture its metrics. In this case I would need to configure every asset in my Metricbeat instance. Can this scale?
  2. Configure each asset in the enterprise to send metrics via syslog (I am not sure they can all do this periodically).
  3. Anything else?

I am looking for scalable metrics collection into Elastic. What would you suggest, and what configuration would be required at this size? (A separate Metricbeat cluster?)

I have also checked Metricbeat for Kafka. It does not give much information about Kafka, such as whether the pointer is moving, or CPU and memory usage.


The recommended installation is to run Metricbeat on each edge node so that metrics can be fetched from localhost. The Metricbeat instances then connect to Elasticsearch. In your case this means you have as many Metricbeat instances as you have machines running your services.
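
As a rough sketch of that per-host setup, a `metricbeat.yml` along these lines could run on each machine. The metricset list, the period, and the Elasticsearch host name `es.example.internal` are placeholders rather than recommendations, and option names can differ between Metricbeat versions, so check the reference for the version you install.

```yaml
# Hypothetical per-host metricbeat.yml (sketch, not a tuned config).
metricbeat.modules:
  # Collect host metrics locally instead of polling remote IPs.
  - module: system
    metricsets: ["cpu", "memory", "filesystem"]
    period: 10s

# Each instance ships directly to Elasticsearch.
output.elasticsearch:
  hosts: ["es.example.internal:9200"]  # placeholder host
```

Since each instance only talks to localhost, the file can stay largely identical across hosts that run the same services.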

For Kafka: are you referring to the Kafka module in Metricbeat?

Thanks. If you have 1000 systems in the enterprise and need to upgrade the Metricbeat installed on each one, what would you do?

So what would be the best approach at enterprise scale (still Metricbeat on each node)?

Yes, I was referring to the Metricbeat Kafka module.

If you have 1000 machines, I assume you have some automation such as Puppet or Chef in place for deployment. I would use the exact same mechanism to update all Metricbeat instances.

Even if you have 100,000 machines, I would still recommend Metricbeat on each machine.

For your missing Kafka metrics: would you mind opening a feature request on GitHub so we can track it? It would be great if you could specify which metrics you are missing and what you use them for. This helps us better understand how people use the module.

For Kafka metrics you might consider a mix of different Metricbeat modules. For example, use the system module for process CPU and memory usage, and also to monitor disk usage. The kafka module in Metricbeat only collects partition info (the consumer group metricset is currently in beta and has a known bug, so it does not collect all potential data). But you can also use JMX (through the Jolokia module) or Prometheus in Metricbeat (Metricbeat can poll metrics from Prometheus agents) to collect metrics directly from the JVM (e.g. on GC) and from internal Kafka components.
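
A sketch of such a mix, assuming a broker listening on `localhost:9092` and a Jolokia agent attached to the Kafka JVM on port 8778 (Metricbeat reads JMX through Jolokia). The agent, the port, the process pattern, the namespace name, and the chosen MBean are assumptions for illustration, and option names can vary by Metricbeat version.

```yaml
metricbeat.modules:
  # Host and per-process CPU/memory, plus disk usage.
  - module: system
    metricsets: ["cpu", "memory", "filesystem", "process"]
    period: 10s
    processes: [".*kafka.*"]  # assumed pattern matching the broker process

  # Partition info from the broker itself.
  - module: kafka
    metricsets: ["partition"]
    period: 10s
    hosts: ["localhost:9092"]

  # JVM metrics over JMX, read through a Jolokia agent.
  - module: jolokia
    metricsets: ["jmx"]
    period: 10s
    hosts: ["localhost:8778"]  # assumed Jolokia agent address
    namespace: "jvm"           # hypothetical namespace for the output fields
    jmx.mappings:
      - mbean: "java.lang:type=Memory"
        attributes:
          - attr: HeapMemoryUsage
            field: memory.heap_usage
```

Further MBeans, for example the garbage collector or Kafka's own beans, could be added as extra entries under `jmx.mappings` once you know which ones your broker exposes.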
