Team, is there any way to monitor running ML jobs and send alerts?
You can monitor ML jobs and send alerts using Watcher.
Some things you can look at:
- Job results - set thresholds for scores
- Job messages - look for warnings and error messages
- Job statuses - check if job status is opened and memory status is
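
As a rough illustration of the first item (thresholds on job results), here is a minimal sketch that builds a Watcher watch body in Python. It assumes results live in `.ml-anomalies-*` with the fields `job_id`, `result_type`, `record_score` and `timestamp`, and that `ctx.payload.hits.total` is a plain integer in your version, so please check those against your cluster. The function name `build_score_watch`, the job id `my_job` and the threshold of 75 are made up for the example.

```python
import json


def build_score_watch(job_id, min_score, interval="10m"):
    """Build a watch body (hypothetical helper, not part of any Elastic client).

    The watch runs every `interval` and fires when the given ML job has at
    least one anomaly record with record_score >= min_score in that window.
    """
    return {
        "trigger": {"schedule": {"interval": interval}},
        "input": {
            "search": {
                "request": {
                    # Assumed results index pattern for ML jobs.
                    "indices": [".ml-anomalies-*"],
                    "body": {
                        "size": 0,
                        "query": {
                            "bool": {
                                "filter": [
                                    {"term": {"job_id": job_id}},
                                    {"term": {"result_type": "record"}},
                                    {"range": {"record_score": {"gte": min_score}}},
                                    {"range": {"timestamp": {"gte": f"now-{interval}"}}},
                                ]
                            }
                        },
                    },
                }
            }
        },
        # Fire only if the search matched something.
        "condition": {"compare": {"ctx.payload.hits.total": {"gt": 0}}},
        "actions": {
            # A logging action keeps the sketch simple; swap in email, Slack, etc.
            "log_anomaly": {
                "logging": {
                    "text": "ML job " + job_id
                    + ": {{ctx.payload.hits.total}} records scored >= "
                    + str(min_score)
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into a PUT _watcher/watch/<id> request.
    print(json.dumps(build_score_watch("my_job", 75), indent=2))
```

The printed JSON can be sent as the body of `PUT _watcher/watch/<watch_id>`. The same pattern should work for the other two items by pointing the search at the job messages or using the job stats API as the input instead.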