Error creating machine learning job "autodetect"


(Richard Neely) #1

I get an error creating the most basic machine learning job. In kibana it says "could not open job an internal server error occured." In the elastic logs this is what it says. I can launch autodetect plugin from the cli just fine. Has anyone else ever seen this? Running 5.5 for everything.

[2017-08-28T10:29:15,871][ERROR][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/17563] [CDetachedProcessSpawner.cc@283] Cannot execute './autodetect': No such file or directory
[2017-08-28T10:29:15,872][ERROR][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/17563] [CCommandProcessor.cc@107] Failed to start process './autodetect'
[2017-08-28T10:29:26,214][ERROR][o.e.x.m.j.p.a.NativeAutodetectProcessFactory] Failed to launch autodetect for job http_response_test
[2017-08-28T10:29:26,280][WARN ][o.e.x.p.PersistentTasksNodeService] [test-cld-es01] task job-http_response_test failed with an exception
org.elasticsearch.ElasticsearchException: Failed to launch autodetect for job http_response_test
at org.elasticsearch.xpack.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:47) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:116) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createAutodetectProcess(NativeAutodetectProcessFactory.java:76) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.create(AutodetectProcessManager.java:333) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$1.lambda$doRun$0(AutodetectProcessManager.java:277) ~[?:?]
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660) ~[?:1.8.0_91]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$1.doRun(AutodetectProcessManager.java:276) ~[?:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.5.1.jar:5.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: java.io.FileNotFoundException: /tmp/autodetect_http_response_test_log_17488 (No such file or directory)
at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_91]
at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_91]
at java.io.FileInputStream.(FileInputStream.java:138) ~[?:1.8.0_91]
at java.io.FileInputStream.(FileInputStream.java:93) ~[?:1.8.0_91]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:297) ~[?:?]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper$PrivilegedInputPipeOpener.run(NamedPipeHelper.java:286) ~[?:?]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_91]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:139) ~[?:?]
at org.elasticsearch.xpack.ml.utils.NamedPipeHelper.openNamedPipeInputStream(NamedPipeHelper.java:106) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.ProcessPipes.connectStreams(ProcessPipes.java:140) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:112) ~[?:?]


(David Kyle) #2

Hi,

Can you tell me what OS you are using please?

Under the elasticsearch installation do you have a directory ES_HOME/plugins/x-pack/platform/<your platform>/bin where <your platform> is one of linux-x86_64, windows-x86_64 or darwin-x86_64. Can you list the contents? You should have a program called autodetect it needs to have execute permission.

What is the output from ES_HOME/plugins/x-pack/platform/<your platform>/bin/autodetect --help


(Richard Neely) #3

This runs on Centos 7.3. The plugin is there.

/usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/autodetect --help

Usage: autodetect [options] [+ [by ]]
Options::
--help Display this information and exit
--version Display version information and exit
--limitconfig arg Optional limit config file
--modelconfig arg Optional model config file
--fieldconfig arg Optional field config file
--modelplotconfig arg Optional model plot config file
--jobid arg ID of the job this process is associated with
--logProperties arg Optional logger properties file


(David Kyle) #4

Your original log message indicated that the autodetect program could not be found so good to know that it's present.

Does the user elasticsearch runs as have permission to execute a program in that directory, can you run ls -lh on the /bin and linux-x86_64 directories please.

How did you install elastic? Did you use the RPM installer or another method?


(Richard Neely) #5

Yes user has permission. It was installed via rpm.

total 464K
-rwxr-xr-x 1 elasticsearch elasticsearch 112K Aug 10 12:05 autoconfig
-rwxr-xr-x 1 elasticsearch elasticsearch 145K Aug 10 12:05 autodetect
-rwxr-xr-x 1 elasticsearch elasticsearch 80K Aug 10 12:05 controller
-rwxr-xr-x 1 elasticsearch elasticsearch 124K Aug 10 12:05 normalize


(David Kyle) #6

Communications between the elasticsearch JVM and the autodetect process is done via named pipes, these pipes are created in /tmp. In the log there is a message:

Caused by: java.io.FileNotFoundException: /tmp/autodetect_http_response_test_log_17488 (No such file or directory)

/tmp must be readable, if it is mounted with no_exec then autodetect cannot start.

/tmp is the default temporary directory the JVM uses but you can change this with the JVM option -Djava.io.tmpdir=/var/data/elasticsearch/tmp

Can you try adding that setting to ES_HOME/config/jvm.options without a suitable choice of tmpdir. This will required a restart of elasticsearch I'm afraid.


(Richard Neely) #7

There isn't anything special about mount tmp dir. It's mounted with all the default permissions for Centos 7. drwxrwxrwt. 14 root root 4096 Aug 30 13:37 tmp


(David Kyle) #8

Can you cat /etc/fstab please.

I noticed from the directory listing that autodetect was installed 10th August. Did ML work for you previously and then stopped or is that you hadn't tried ML before?


(Richard Neely) #9

I haven't tried it before. That date may correlate an upgrade to 5.5 from 5.4.

/etc/fstab

Created by anaconda on Tue Jul 12 11:42:31 2016

Accessible filesystems, by reference, are maintained under '/dev/disk'

See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info

/dev/mapper/centos-root / ext4 defaults,noatime,barrier=0 1 1
UUID=d0f3d578-2c58-49b7-8649-830b876231c1 /boot xfs defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
UUID=c6182de5-18d3-461f-866e-8efa0940f256 /srv/data ext4 defaults,noatime,nofail,barrier=0 0 0


(David Kyle) #10

Thanks for the output this is a curious problem.

Machine Learning starts a program called controller when elasticsearch starts up this manages the actual processes that do the analytics. Can you check if controller is running please with ps -ef | grep controller

If it is please run ls -l /proc/<controller-process-pid>/ where controller-process-pid is the PID of the process you found with the ps command.

Thanks


(Richard Neely) #11

[es01 ~]# ps -ef | grep controller
root 14285 14213 0 08:12 pts/0 00:00:00 grep --color=auto controller
elastic+ 17563 17488 0 Jul31 ? 00:00:00 /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller
[es01 ~]# ls -l /proc/17563
total 0
dr-xr-xr-x 2 elasticsearch elasticsearch 0 Aug 31 08:12 attr
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 autogroup
-r-------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 auxv
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 cgroup
--w------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 clear_refs
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:11 cmdline
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 comm
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 coredump_filter
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 cpuset
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 cwd -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin (deleted)
-r-------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 environ
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 exe -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller (deleted)
dr-x------ 2 elasticsearch elasticsearch 0 Aug 24 09:13 fd
dr-x------ 2 elasticsearch elasticsearch 0 Aug 31 08:12 fdinfo
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 gid_map
-r-------- 1 elasticsearch elasticsearch 0 Aug 24 09:13 io
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 limits
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 loginuid
dr-x------ 2 elasticsearch elasticsearch 0 Aug 31 08:12 map_files
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 maps
-rw------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mem
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mountinfo
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mounts
-r-------- 1 elasticsearch elasticsearch 0 Aug 31 08:12 mountstats
dr-xr-xr-x 6 elasticsearch elasticsearch 0 Aug 31 08:12 net
dr-x--x--x 2 elasticsearch elasticsearch 0 Aug 31 08:12 ns
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 numa_maps
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 oom_adj
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 oom_score
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 oom_score_adj
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 pagemap
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 personality
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 projid_map
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 root -> /
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 sched
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 schedstat
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 sessionid
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 setgroups
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 smaps
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 stack
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 24 09:13 stat
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 24 09:13 statm
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 06:04 status
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 syscall
dr-xr-xr-x 5 elasticsearch elasticsearch 0 Aug 31 08:12 task
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 timers
-rw-r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 uid_map
-r--r--r-- 1 elasticsearch elasticsearch 0 Aug 31 08:12 wchan


(David Kyle) #12

Thanks Richard you've managed to shine a light.

From the grep output
elastic+ 17563 17488 0 Jul31 ? 00:00:00 /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller

The start time is 31st July this is before the timestamp of the files in the linux-x86_64/bin directory

controller's current work directory seems to have been deleted:
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 cwd -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin (deleted)

and the same for the image
lrwxrwxrwx 1 elasticsearch elasticsearch 0 Aug 31 08:12 exe -> /usr/share/elasticsearch/plugins/x-pack/platform/linux-x86_64/bin/controller (deleted)

It looks like the upgrade process did not go cleanly and the old controller did not shut down properly. Can you try shutting down elasticsearch, killing controller kill 17563 and restarting elasticsearch please.

How do you perform the upgrade? That information would help me in tracking down the cause of this issue.


(Richard Neely) #13

That was it. I restarted all the nodes and it seemed to clear up. The install is managed through puppet, that was the only thing that didn't work right.


(David Kyle) #14

Great I'm happy you've got it working. Good luck with your anomaly detection!


(system) #15

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.