Best practices for preprocessing data and monitoring resource usage in predefined ML jobs (security:host)

Question 1:
I am planning to use one of the predefined ML jobs from the security:host module. To avoid overloading the ML job with more data than it needs, what would be the best data-preprocessing approach?

  • Should I first define a saved search (a saved object) that filters the data, and use that as the data source for the ML job's datafeed?

  • Or would it be better to create an ingest pipeline that drops or trims documents at index time?

  • Or should I pre-aggregate the data with a transform job and point the ML job at the transform's destination index? (A rough sketch of what I have in mind follows this list.)

Question 2:
Does Elasticsearch/Kibana provide built-in dashboards for ML jobs? If not, how can I best monitor and evaluate an ML job's metrics (e.g., resource consumption such as memory) so I can properly assess the performance of my test runs?