SIEM ML Jobs Stuck | No Node Found to Open Job

hpicass0 · May 21, 2020, 11:59am

Hello Everyone,

I hope you are safe and well.

I'm running a POC in Elastic Cloud, and I have an issue with ML jobs.

I have some jobs stuck, as you can see below, under SIEM > Detections > ML Job Settings.
They have been in this status for a few days.

I tried to find more details about the error and found the messages below:

No node found to open job. Reasons [persistent task is awaiting node assignment.]
No node found to open job. Reasons [Not opening job [rare_process_by_host_windows_ecs] because job memory requirements are stale - refresh requested]

Is that related to resource utilization? The ML node seems to be healthy

Thanks
Hosam

sophie_chang · May 21, 2020, 12:30pm

Hi @hpicass0

Your screenshot shows the SIEM job list. If you jump to the Machine Learning UI tab, I suspect those jobs will be in an "opening" state. Based on the data it is trying to analyse, ML has estimated that the available memory is not quite sufficient to open all the ML jobs on a single 1GB node.

Assuming you have multiple jobs in this "opening" state, I would first suggest taking a look at the Machine Learning job list. Make sure that jobs for which you are not yet ingesting data are closed (e.g. if you have no auditbeat data, you won't need the auditbeat jobs).

By closing these, it should free up memory for other jobs to start.
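As a rough sketch of that first step, the snippet below scans a job-stats response for jobs that are open but have processed no records, and builds the close request for each. It assumes the response shape of `GET _ml/anomaly_detectors/_stats`; the sample data and the helper names (`idle_jobs`, `close_path`) are made up for illustration, and nothing is sent to the cluster.

```python
def idle_jobs(stats):
    """Return IDs of jobs that are open but have not processed any records."""
    idle = []
    for job in stats.get("jobs", []):
        is_open = job.get("state") == "opened"
        records = job.get("data_counts", {}).get("processed_record_count", 0)
        if is_open and records == 0:
            idle.append(job["job_id"])
    return idle


def close_path(job_id):
    """Path of the close-job API for one job; it is issued as a POST."""
    return f"/_ml/anomaly_detectors/{job_id}/_close"


# Hypothetical sample, trimmed to the fields used above.
sample_stats = {
    "jobs": [
        {"job_id": "example_idle_job", "state": "opened",
         "data_counts": {"processed_record_count": 0}},
        {"job_id": "example_busy_job", "state": "opened",
         "data_counts": {"processed_record_count": 120000}},
        {"job_id": "example_closed_job", "state": "closed",
         "data_counts": {"processed_record_count": 0}},
    ]
}

for job_id in idle_jobs(sample_stats):
    print("POST", close_path(job_id))
```

A job that has only just been opened will also show zero records, so treat the result as a list of candidates to review rather than something to close blindly.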

If jobs still remain "opening", then you can free up space by closing jobs which have a lower relevance for you. You could also choose to run a subset of jobs in real-time and some in batch against a specified date range.
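For the batch option, a minimal sketch of the request is shown below. It assumes the start-datafeed API (`POST _ml/datafeeds/<feed_id>/_start`) with `start` and `end` in the body; the datafeed ID and the helper name `batch_datafeed_request` are hypothetical, and the snippet only builds the request without sending it.

```python
import json
from datetime import datetime, timezone


def batch_datafeed_request(datafeed_id, start, end):
    """Build (method, path, body) to run a datafeed over a fixed date range.

    Both datetimes must be timezone-aware; they are converted to UTC.
    """
    if end <= start:
        raise ValueError("end must be after start")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    body = {
        "start": start.astimezone(timezone.utc).strftime(fmt),
        "end": end.astimezone(timezone.utc).strftime(fmt),
    }
    return "POST", f"/_ml/datafeeds/{datafeed_id}/_start", json.dumps(body)


# Hypothetical datafeed ID and date range.
method, path, body = batch_datafeed_request(
    "datafeed-example_job",
    datetime(2020, 5, 1, tzinfo=timezone.utc),
    datetime(2020, 5, 8, tzinfo=timezone.utc),
)
print(method, path, body)
```

Because an `end` time is given, the datafeed stops when it reaches that time instead of continuing in real time.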

ML jobs model data in real time and hold this model in memory. The model size is determined by the characteristics of the input data. In general, the higher the cardinality, the larger the memory required. As such, depending on your data, it may only be possible to concurrently run a subset of the SIEM jobs in real-time on the SIEM cluster.
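To make the "subset of jobs" idea concrete, here is a back-of-the-envelope sketch that walks a priority-ordered list of jobs and keeps each one that still fits in a memory budget. The job names, per-job memory figures, and the 300 MB budget are all invented for illustration; the real node assignment logic in Elasticsearch accounts for more than this simple sum.

```python
def pick_jobs(jobs_by_priority, budget_mb):
    """Greedily keep jobs, in priority order, while they fit in the budget.

    jobs_by_priority: list of (job_id, estimated_memory_mb), most important first.
    Returns (chosen_job_ids, total_memory_used_mb).
    """
    chosen, used = [], 0
    for job_id, need_mb in jobs_by_priority:
        if used + need_mb <= budget_mb:
            chosen.append(job_id)
            used += need_mb
    return chosen, used


# Hypothetical estimates; real values depend on the cardinality of your data.
candidates = [
    ("example_job_a", 120),
    ("example_job_b", 150),
    ("example_job_c", 64),
    ("example_job_d", 20),
]

chosen, used = pick_jobs(candidates, budget_mb=300)
print(chosen, used)
```

With these made-up numbers, `example_job_c` is skipped because it would push the total past the budget, while the smaller `example_job_d` still fits.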

If you are seeing other errors in the Machine Learning app, please let us know.

Regards
Sophie

system · June 18, 2020, 12:30pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
