Troubleshooting with machine learning

The Machine Learning feature is implemented using native processes that run outside of the JVM that runs Elasticsearch. In your case the native process called autodetect cannot be started. Normally this process is started by another native process called controller, which is itself started when Elasticsearch starts. I can reproduce your sequence of error messages if I kill this controller process and then try to open a job.

Please can you check what happens if you try to run the relevant native processes at the command prompt:

$ES_HOME/plugins/x-pack/platform/linux-x86_64/bin/controller --version
$ES_HOME/plugins/x-pack/platform/linux-x86_64/bin/autodetect --version

Do either of these commands complain about missing OS libraries or other OS-related problems?
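If the output of those commands is hard to interpret, ldd can list unresolved shared libraries directly. The following is only a minimal sketch: it assumes ldd is available on your system and that ES_HOME is set in your shell, and missing_libs is a hypothetical helper name, not part of Elasticsearch.

```shell
#!/bin/sh
# Print the shared libraries that ldd cannot resolve for a given binary.
# Prints nothing when ldd reports no unresolved dependency.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Check both ML binaries, but only if ES_HOME is set in this shell.
if [ -n "${ES_HOME:-}" ]; then
    for bin in controller autodetect; do
        echo "== $bin =="
        missing_libs "$ES_HOME/plugins/x-pack/platform/linux-x86_64/bin/$bin"
    done
fi
```

Any line containing "not found" names a library that would need to be installed before the process can start.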

If neither command reports an OS-level problem, the most likely scenario is that your controller process has been killed. Do you see it in the process list if you run ps -e | grep controller?
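Grepping the process list can also match unrelated processes whose names merely contain "controller". As an alternative, here is a small sketch that matches the process name exactly; it assumes pgrep (from procps) is installed, and controller_status is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether a process with exactly the given name is running.
# Defaults to "controller", the ML native process discussed above.
controller_status() {
    name="${1:-controller}"
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
    fi
}

controller_status controller
```

If this prints "not running" while Elasticsearch itself is up, that would be consistent with the controller having been killed.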

How long has this node been running for? It would be helpful if you could check all the logs since the node started for messages containing either controller or CppLogMessageHandler. Something like:

grep -E 'controller|CppLogMessageHandler' $ES_HOME/logs/*log

Also, I will raise an issue for friendlier error reporting in the case where the controller process is not running for some reason. The way it's reported at the moment is impossible to understand.
