How to solve hard and soft limit warnings in machine learning jobs

Hello,

I have created a machine learning job to detect port scanners using the packetbeat-* index.

type of job: Population
Population field: destination.ip
metric: distinct count(destination port)
influencers: destination.ip and source.ip
bucket span: 15 min

It's working perfectly, but I am getting warnings for both soft_limit AND hard_limit:

Job memory status changed to soft_limit; memory pruning will now be more aggressive
Job memory status changed to hard_limit; job exceeded model memory limit 23mb by 1.7mb. Adjust the analysis_limits.model_memory_limit setting to ensure all data is analyzed

I have a dedicated machine learning node: 6 CPUs and 8 GB RAM.

Could you please tell me how I can solve these warnings?

Thanks for your help !

Except that it is not working perfectly: the job is "throwing away" data in order to stay within its memory limit.

By default, only 30% of your 8 GB node (i.e. 2.4 GB) is available for ML to use (see xpack.ml.max_machine_memory_percent in the Machine learning settings documentation in the Elasticsearch Reference).

You should check how much memory is being used by other jobs (summing up the model_bytes for all running/open jobs) using:

GET _ml/anomaly_detectors/_stats
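If you want to do that summation programmatically rather than by eye, here is a minimal sketch. The response shape follows the ML job stats API (each job carries `model_size_stats.model_bytes` and a `state`); the job names and byte counts in the sample response are hypothetical.

```python
# Sketch: sum model_bytes across open anomaly detection jobs, given the
# parsed JSON body of GET _ml/anomaly_detectors/_stats.

def total_model_bytes(stats_response):
    """Sum model_bytes for jobs whose state is 'opened'."""
    return sum(
        job["model_size_stats"]["model_bytes"]
        for job in stats_response.get("jobs", [])
        if job.get("state") == "opened"
    )

# Abbreviated, hypothetical stats response for illustration:
sample = {
    "jobs": [
        {"job_id": "port-scan", "state": "opened",
         "model_size_stats": {"model_bytes": 24_700_000}},
        {"job_id": "dns-tunnel", "state": "closed",
         "model_size_stats": {"model_bytes": 5_000_000}},
    ]
}

# Only the open job counts toward current ML memory usage.
print(total_model_bytes(sample))
```

Comparing that total against the node's ML memory budget (2.4 GB in your case) tells you how much headroom is left.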

If there's enough room, you can increase the memory limit on this job. See model_memory_limit at https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-update-job.html
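For example, raising the limit looks roughly like the following (the job ID placeholder and the 40mb value are illustrative, not a recommendation; note the job generally needs to be closed before this setting can be increased):

```
POST _ml/anomaly_detectors/<job_id>/_update
{
  "analysis_limits": {
    "model_memory_limit": "40mb"
  }
}
```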

If not, you need a bigger ML node.


Thanks for your explanation, @richcollier.
Now I understand better how it works.