Fewer active ML nodes than expected in anomaly detection jobs

Hi,

I have a 5-node Elasticsearch cluster, and I enabled the ML role on all of the nodes.
From GET _nodes I can see the ml role on each node; however, the anomaly detection jobs page shows only 4 active ML nodes.


      "roles" : [
        "data",
        "ingest",
        "ml",
        "remote_cluster_client",
        "transform"
      ],

What can I check to see what is wrong?

Hi @heric,

Just to clarify in case the naming is confusing, the "Active ML nodes" counter does not reflect the number of nodes with the "ml" role assigned. It counts the nodes that currently have running jobs allocated to them, hence "Active". So it means one of your ML nodes has no active jobs allocated to it.

Hi @darnautov, thank you for the clarification.

I will add more jobs to see if the active ML nodes increase.

Is there any way I can check which jobs are assigned to which nodes?

@heric, yes. In the UI, you can expand a row on the "Anomaly detection jobs" page to see the node name.

You can also refer to the anomaly detection job statistics API.
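As a rough sketch of how that API could be used to check this, the snippet below reads a job statistics response and prints how many jobs each node is running. The number of lines it prints should then correspond to the nodes that have at least one job assigned, which I am assuming is what the "Active ML nodes" counter reflects. The function name jobs_per_node is made up for illustration, and the host, port, and lack of authentication in the usage line are assumptions; it requires jq.

```shell
#!/bin/bash
# Sketch only: summarize how many jobs are assigned to each ML node.
# Reads a job statistics response (JSON) on stdin and prints "node,job_count" lines.
# jobs_per_node is a hypothetical helper name, not part of any Elastic tooling.
jobs_per_node() {
  # 1. Keep only jobs that have a node assigned (closed jobs have no .node).
  # 2. Group the node names so that each node appears once.
  # 3. Print the node name and the number of jobs in its group.
  jq --raw-output '
    [.jobs[] | select(.node != null) | .node.name]
    | group_by(.)
    | .[]
    | "\(.[0]),\(length)"
  '
}

# Usage against a live cluster (assumed: localhost:9200, no authentication):
# curl -s "http://localhost:9200/_ml/anomaly_detectors/_stats" | jobs_per_node
```

A node that has the ml role but is missing from this output simply has no jobs assigned to it at the moment.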


Example of using the API

contents of list_all_ml_jobs_nodes.sh:

#!/bin/bash
# List every anomaly detection job together with the node it is assigned to.
HOST='localhost'
PORT=9200
# Uncomment and adjust if the cluster requires authentication.
#CURL_AUTH="-u elastic:changeme"
# Get the list of all job IDs (parsed with jq rather than awk/sed).
list=$(curl $CURL_AUTH -s "http://$HOST:$PORT/_ml/anomaly_detectors" | jq --raw-output '.jobs[].job_id')
echo "job_id,node"
# Loop through all jobs to find which node each one is running on.
while read -r JOB_ID; do
   curl $CURL_AUTH -s -XGET "http://$HOST:$PORT/_ml/anomaly_detectors/${JOB_ID}/_stats" | jq --raw-output '.jobs[0] | "\(.job_id),\(.node.name)"'
done <<< "$list"

This would yield output something like:

%> ./list_all_ml_jobs_nodes.sh
job_id,node
aaaaaaa,richs-mbp.lan
authd,null
bbbbb,richs-mbp.lan
bot_detection,null
...

where the first column is the job_id and the second is the node name (null if the job is not running).


After adding more anomaly detection jobs

Thank you @darnautov @richcollier
