Best practices for preprocessing data and monitoring resource usage in predefined ML jobs (security:host)

Question 1:
I am planning to use the predefined ML job from the security:host module. What is the best way to preprocess the data so that the ML job is not overloaded with too much input?

  • Should I first store the data in a saved object before using it as the data source for the ML job?

  • Or would it be better to create an ingest pipeline?

  • Or should I prepare the data with a transform before feeding it into the ML job?

Question 2:
Does Elasticsearch/Kibana provide dashboards for ML jobs? If not, how can I best monitor and evaluate the metrics (e.g., resource consumption) of an ML job so I can properly assess the performance of my tests?