Feature: Metricbeat for showing statistics of nodes managed by chef.io

I use chef for automation. I want to build a dashboard where I can monitor the status of my nodes.

For example, I want to be able to see which nodes have been bootstrapped, when they last checked in, and their uptime, like in the chef server web UI.

Additionally, it would be great to collect the duration of each chef-client run when it runs automatically every x minutes or so. That would be a useful indicator of how long deployments are taking.

This is just an idea in the early stage. I'd really like to hear your feedback and opinions about it.

I wonder which of these stats are generic and which ones are chef specific. Contributions to Metricbeat modules are always very welcome :wink:

which nodes have been bootstrapped:

How can you tell a node has been bootstrapped? Any kind of meta-data/file that might change over time?

last checked in, uptime like in the chef server web UI

What kind of API do you use? Do you ask the server or the agent? Metricbeat provides a generic HTTP module, but a curated one might be nice to have.
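Whichever side ends up being queried, the per-node document could stay quite small. Here is a minimal sketch, assuming the node's automatic attributes are already available as a dict and that they carry `ohai_time` (epoch seconds of the last run) and `uptime_seconds`, as Ohai normally reports them. The function name `summarize_node` and the output field names are made up for illustration, and fetching the attributes from the server is left out on purpose.

```python
from datetime import datetime, timezone


def summarize_node(name, automatic, now=None):
    """Flatten a node's automatic attributes into one status document.

    `automatic` is assumed to be a dict holding `ohai_time` and
    `uptime_seconds`; both keys are treated as optional here.
    """
    # Allow the caller to inject `now` so the result is reproducible.
    if now is None:
        now = datetime.now(timezone.utc).timestamp()

    ohai_time = automatic.get("ohai_time")
    return {
        "node": name,
        # Hypothetical field: True once the node has reported at least once.
        "has_checked_in": ohai_time is not None,
        "seconds_since_checkin": (
            None if ohai_time is None else round(now - ohai_time, 3)
        ),
        "uptime_seconds": automatic.get("uptime_seconds"),
    }
```

A document like this could be served over HTTP as JSON and picked up by the generic HTTP module, or it could be the starting point for a curated module.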

collect the run duration of a chef-client's run when it runs automatically every x minutes or so

Metricbeat's system/process module only polls the running processes periodically. For events on application start/stop (more detailed output), Auditbeat might get you what you want. If the agent or server provided this kind of information itself, it would be easier though.
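Until the agent or server exposes run durations itself, one workaround would be to wrap the scheduled run and record the timing directly. The following is only a sketch under that assumption: `timed_run` and the field names are hypothetical, and the command to run is passed in by the caller rather than hard-coded.

```python
import json
import subprocess
import time
from datetime import datetime, timezone


def timed_run(command):
    """Run `command` (a list of arguments) and return a timing document."""
    started_at = datetime.now(timezone.utc).isoformat()
    start = time.monotonic()  # monotonic clock: unaffected by clock changes
    result = subprocess.run(command, check=False)
    duration = time.monotonic() - start
    return {
        "command": command,
        "started_at": started_at,
        "duration_seconds": round(duration, 3),
        "exit_code": result.returncode,
    }


def to_json_line(document):
    """Serialize the document as one JSON line, e.g. for appending to a log."""
    return json.dumps(document, sort_keys=True)
```

Calling `to_json_line(timed_run(["chef-client"]))` from the scheduled job and appending the result to a file would give one line per run that a log shipper could pick up. The non-zero exit codes would also cover the failures mentioned below.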

Besides this information, some more stats and failures would be nice to have as well. Maybe also introduce a Filebeat module that ships the agent/server logs?
If software/licenses are updated, versions/dates would be nice to have as well. That is, give the user a chance to see the system is really in the expected state.

Sorry, I don't have time to think about a solution at the moment. So much stuff on my agenda. I just wanted to share my thoughts about what would be a nice feature.
