Error creating machine learning job "autodetect"

Richard_Neely · August 29, 2017, 1:46pm

I get an error creating the most basic machine learning job. In kibana it says "could not open job an internal server error occured." In the elastic logs this is what it says. I can launch autodetect plugin from the cli just fine. Has anyone else ever seen this? Running 5.5 for everything.

[2017-08-28T10:29:15,871][ERROR][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/17563] [CDetachedProcessSpawner.cc@283] Cannot execute './autodetect': No such file or directory
[2017-08-28T10:29:15,872][ERROR][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/17563] [CCommandProcessor.cc@107] Failed to start process './autodetect'
[2017-08-28T10:29:26,214][ERROR][o.e.x.m.j.p.a.NativeAutodetectProcessFactory] Failed to launch autodetect for job http_response_test
[2017-08-28T10:29:26,280][WARN ][o.e.x.p.PersistentTasksNodeService] [test-cld-es01] task job-http_response_test failed with an exception
org.elasticsearch.ElasticsearchException: Failed to launch autodetect for job http_response_test
at org.elasticsearch.xpack.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:47) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:116) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createAutodetectProcess(NativeAutodetectProcessFactory.java:76) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.create(AutodetectProcessManager.java:333) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$1.lambda$doRun$0(AutodetectProcessManager.java:277) ~[?:?]
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660) ~[?:1.8.0_91]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$1.doRun(AutodetectProcessManager.java:276) ~[?:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.5.1.jar:5.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: java.io.FileNotFoundException: /tmp/autodetect_http_response_test_log_17488 (No such file or directory)
at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_91]
at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_91]
at java.io.FileInputStream.(FileInputStream.java:138) ~[?:1.8.0_91]
at java.io.FileInputStream.(FileInputStream.java:93) ~[?:1.8.0_91]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:297) ~[?:?]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:286) ~[?:?]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_91]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:139) ~[?:?]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:106) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.ProcessPipes.connectStreams(ProcessPipes.java:140) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:112) ~[?:?]

dkyle · August 29, 2017, 3:32pm

Hi,

Can you tell me what OS you are using please?

Under the elasticsearch installation do you have a directory ES_HOME/plugins/x-pack/platform/<your platform>/bin where <your platform> is one of linux-x86_64, windows-x86_64 or darwin-x86_64. Can you list the contents? You should have a program called autodetect it needs to have execute permission.

What is the output from ES_HOME/plugins/x-pack/platform/<your platform>/bin/autodetect --help

Richard_Neely · August 29, 2017, 6:22pm

This runs on Centos 7.3. The plugin is there.

/usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/autodetect --help

Usage: autodetect [options] [+ [by ]]
Options::
--help Display this information and exit
--version Display version information and exit
--limitconfig arg Optional limit config file
--modelconfig arg Optional model config file
--fieldconfig arg Optional field config file
--modelplotconfig arg Optional model plot config file
--jobid arg ID of the job this process is associated with
--logProperties arg Optional logger properties file

dkyle · August 30, 2017, 7:33am

Your original log message indicated that the autodetect program could not be found so good to know that it's present.

Does the user elasticsearch runs as have permission to execute a program in that directory, can you run ls -lh on the /bin and linux-x86_64 directories please.

How did you install elastic? Did you use the RPM installer or another method?

Richard_Neely · August 30, 2017, 12:35pm

Yes user has permission. It was installed via rpm.

total 464K
-rwxr-xr-x 1 elasticsearch elasticsearch 112K Aug 10 12:05 autoconfig
-rwxr-xr-x 1 elasticsearch elasticsearch 145K Aug 10 12:05 autodetect
-rwxr-xr-x 1 elasticsearch elasticsearch 80K Aug 10 12:05 controller
-rwxr-xr-x 1 elasticsearch elasticsearch 124K Aug 10 12:05 normalize

dkyle · August 30, 2017, 4:47pm

Communications between the elasticsearch JVM and the autodetect process is done via named pipes, these pipes are created in /tmp. In the log there is a message:

Caused by: java.io.FileNotFoundException: /tmp/autodetect_http_response_test_log_17488 (No such file or directory)

/tmp must be readable, if it is mounted with no_exec then autodetect cannot start.

/tmp is the default temporary directory the JVM uses but you can change this with the JVM option -Djava.io.tmpdir=/var/data/elasticsearch/tmp

Can you try adding that setting to ES_HOME/config/jvm.options without a suitable choice of tmpdir. This will required a restart of elasticsearch I'm afraid.

Richard_Neely · August 30, 2017, 5:40pm

There isn't anything special about mount tmp dir. It's mounted with all the default permissions for Centos 7. drwxrwxrwt. 14 root root 4096 Aug 30 13:37 tmp

dkyle · August 30, 2017, 5:44pm

Can you cat /etc/fstab please.

I noticed from the directory listing that autodetect was installed 10th August. Did ML work for you previously and then stopped or is that you hadn't tried ML before?

Richard_Neely · August 30, 2017, 5:46pm

I haven't tried it before. That date may correlate an upgrade to 5.5 from 5.4.

/etc/fstab

Created by anaconda on Tue Jul 12 11:42:31 2016

Accessible filesystems, by reference, are maintained under '/dev/disk'

See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info

/dev/mapper/centos-root / ext4 defaults,noatime,barrier=0 1 1
UUID=d0f3d578-2c58-49b7-8649-830b876231c1 /boot xfs defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
UUID=c6182de5-18d3-461f-866e-8efa0940f256 /srv/data ext4 defaults,noatime,nofail,barrier=0 0 0

dkyle · August 31, 2017, 10:35am

Thanks for the output this is a curious problem.

Machine Learning starts a program called controller when elasticsearch starts up this manages the actual processes that do the analytics. Can you check if controller is running please with ps -ef | grep controller

If it is please run ls -l /proc/<controller-process-pid>/ where controller-process-pid is the PID of the process you found with the ps command.

Thanks

Richard_Neely · August 31, 2017, 12:14pm

[es01 ~]# ps -ef | grep controller
root 14285 14213 0 08:12 pts/0 00:00:00 grep --color=auto controller
elastic+ 17563 17488 0 Jul31 ? 00:00:00 /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller
[es01 ~]# ls -l /proc/17563
total 0
dr-xr-xr-x 2 elasticsearch elasticsearch 0 Aug 31 08:12 attr
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 autogroup
-r-------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 auxv
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 cgroup
--w------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 clear_refs
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:11 cmdline
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 comm
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 coredump_filter
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 cpuset
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 cwd -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin (deleted)
-r-------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 environ
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 exe -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller (deleted)
dr-x------ 2 elasticsearch elasticsearch 0 Aug 24 09:13 fd
dr-x------ 2 elasticsearch elasticsearch 0 Aug 31 08:12 fdinfo
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 gid_map
-r-------- 1 elasticsearch elasticsearch 0 Aug 24 09:13 io
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 limits
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 loginuid
dr-x------ 2 elasticsearch elasticsearch 0 Aug 31 08:12 map_files
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 maps
-rw------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mem
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mountinfo
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mounts
-r-------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mountstats
dr-xr-xr-x 6 elasticsearch elasticsearch 0 Aug 31 08:12 net
dr-x--x--x 2 elasticsearch elasticsearch 0 Aug 31 08:12 ns
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 numa_maps
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 oom_adj
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 oom_score
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 oom_score_adj
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 pagemap
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 personality
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 projid_map
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 root -> /
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 sched
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 schedstat
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 sessionid
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 setgroups
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 smaps
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 stack
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 24 09:13 stat
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 24 09:13 statm
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 06:04 status
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 syscall
dr-xr-xr-x 5 elasticsearch elasticsearch 0 Aug 31 08:12 task
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 timers
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 uid_map
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 wchan

dkyle · August 31, 2017, 12:38pm

Thanks Richard you've managed to shine a light.

From the grep output
elastic+ 17563 17488 0 Jul31 ? 00:00:00 /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller

The start time is 31st July this is before the timestamp of the files in the linux-x86_64/bin directory

controller's current work directory seems to have been deleted:
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 cwd -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin (deleted)

and the same for the image
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 exe -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller (deleted)

It looks like the upgrade process did not go cleanly and the old controller did not shut down properly. Can you try shutting down elasticsearch, killing controller kill 17563 and restarting elasticsearch please.

How do you perform the upgrade? That information would help me in tracking down the cause of this issue.

Richard_Neely · September 1, 2017, 12:26pm

That was it. I restarted all the nodes and it seemed to clear up. The install is managed through puppet, that was the only thing that didn't work right.

dkyle · September 1, 2017, 12:48pm

Great I'm happy you've got it working. Good luck with your anomaly detection!

system · September 29, 2017, 12:48pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
ML job error : Failed to launch autodetect for job Kibana elastic-stack-machine-learning	7	1274	February 3, 2021
Unable to create ML job Elasticsearch elastic-stack-machine-learning	6	3147	October 12, 2018
Error Machine Learning Job in Elastic Cloud Enterprise "autodetect" Elasticsearch elastic-stack-machine-learning , docker	13	1185	February 10, 2021
ML problem Elasticsearch elastic-stack-machine-learning	2	371	January 10, 2023
Can not create Machine learning job in Kibana Kibana elastic-stack-machine-learning	2	556	October 5, 2020