Unable to create ML job

machine-learning

(Pradip Das) #1

Hi Team,

I am using Elasticsearch 6.4.0, and my Beats are sending logs/events directly to ES. But when I try to create a Machine Learning job, it gives me the error below.

Elasticsearch output:

[2018-09-12T17:33:40,521][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [J14-npm] Opening job [testml]
[2018-09-12T17:33:40,534][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [J14-npm] [testml] Loading model snapshot [N/A], job latest_record_timestamp [N/A]
[2018-09-12T17:33:40,542][ERROR][o.e.x.m.j.p.a.NativeAutodetectProcessFactory] Failed to launch autodetect for job testml
[2018-09-12T17:33:40,589][WARN ][r.suppressed ] path: /_xpack/ml/anomaly_detectors/testml/_open, params: {job_id=testml}
org.elasticsearch.ElasticsearchException: Unexpected job state [failed] while waiting for job to be opened
at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:34) ~[?:?]
at org.elasticsearch.xpack.ml.action.TransportOpenJobAction$JobPredicate.test(TransportOpenJobAction.java:814) ~[?:?]
at org.elasticsearch.xpack.ml.action.TransportOpenJobAction$JobPredicate.test(TransportOpenJobAction.java:774) ~[?:?]
at org.elasticsearch.persistent.PersistentTasksService.lambda$waitForPersistentTaskCondition$2(PersistentTasksService.java:169) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.clusterChanged(ClusterStateObserver.java:186) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateListeners$7(ClusterApplierService.java:506) ~[elasticsearch-6.4.0.jar:6.4.0]
at java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3527) [?:1.8.0_181]
at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743) [?:1.8.0_181]
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
[2018-09-12T17:33:40,596][WARN ][o.e.p.PersistentTasksNodeService] [J14-npm] task job-testml failed with an exception
org.elasticsearch.ElasticsearchException: Failed to launch autodetect for job testml
at org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:38) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:104) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createAutodetectProcess(NativeAutodetectProcessFactory.java:60) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.create(AutodetectProcessManager.java:504) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.createProcessAndSetRunning(AutodetectProcessManager.java:458) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager.access$800(AutodetectProcessManager.java:90) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$2.doRun(AutodetectProcessManager.java:427) ~[?:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) ~[elasticsearch-6.4.0.jar:6.4.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.4.0.jar:6.4.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: java.nio.file.NoSuchFileException: /tmp/elasticsearch.W0dkkJp9/limitconfig774444647943567679.conf
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[?:?]
at java.nio.file.Files.newByteChannel(Files.java:361) ~[?:1.8.0_181]
at java.nio.file.Files.createFile(Files.java:632) ~[?:1.8.0_181]
at java.nio.file.TempFileHelper.create(TempFileHelper.java:138) ~[?:1.8.0_181]
at java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:161) ~[?:1.8.0_181]
at java.nio.file.Files.createTempFile(Files.java:852) ~[?:1.8.0_181]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectBuilder.buildLimits(AutodetectBuilder.java:114) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectBuilder.build(AutodetectBuilder.java:103) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:99) ~[?:?]
... 10 more
Caused by: java.nio.file.NoSuchFileException: /tmp/elasticsearch.W0dkkJp9/limitconfig774444647943567679.conf
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[?:?]
at java.nio.file.Files.newByteChannel(Files.java:361) ~[?:1.8.0_181]
at java.nio.file.Files.createFile(Files.java:632) ~[?:1.8.0_181]
at java.nio.file.TempFileHelper.create(TempFileHelper.java:138) ~[?:1.8.0_181]
at java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:161) ~[?:1.8.0_181]
at java.nio.file.Files.createTempFile(Files.java:852) ~[?:1.8.0_181]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectBuilder.buildLimits(AutodetectBuilder.java:114) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectBuilder.build(AutodetectBuilder.java:103) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:99) ~[?:?]
... 10 more
[2018-09-12T17:33:40,838][INFO ][o.e.x.m.a.TransportPutDatafeedAction] [J14-npm] Created datafeed [datafeed-testml]


(rich collier) #2

I think the "Failed to launch autodetect for job testml" error is probably the most important line. The autodetect process is the C++-based executable that runs the machine learning algorithms.

What operating system are you running on (uname -a)?


(Pradip Das) #3

Hi Rich,

I am using CentOS 7.5

Linux ipaysyslogsrv.ipay.local 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


(David Kyle) #4

This looks very much like an issue found on systemd-based Linux systems where the tmp directory was automatically cleaned up; however, that was fixed in 6.4.0. Have you upgraded recently, or is this a fresh install?

Can you check that the directory /tmp/elasticsearch.W0dkkJp9/ exists and that the user you run Elasticsearch as has read + write + execute permission on it? If the directory is missing, try re-creating it with the correct permissions, i.e.

sudo mkdir /tmp/elasticsearch.W0dkkJp9
sudo chown elasticsearch_user /tmp/elasticsearch.W0dkkJp9
sudo chmod 700 /tmp/elasticsearch.W0dkkJp9

Then try creating a new job.
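
The check described above can be sketched as a small shell helper. This is only an illustration: check_tmpdir is a hypothetical name, not part of Elasticsearch, and it tests permissions for whichever user runs it, so it would need to be run as the Elasticsearch user (for example via sudo -u elasticsearch_user) to be meaningful.

```shell
# Hypothetical helper (not an Elasticsearch tool): report whether a directory
# exists and is readable, writable and executable by the CURRENT user.
check_tmpdir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "ok: $dir"
        return 0
    fi
    echo "insufficient permissions: $dir"
    return 2
}

# Example call, using the path from the stack trace in this thread:
# check_tmpdir /tmp/elasticsearch.W0dkkJp9
```

Note that when run as root the permission tests will generally pass regardless of the directory mode, so a result of "ok" only says something about the user that ran the check.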


(David Roberts) #5

The key error is the NoSuchFileException for /tmp/elasticsearch.W0dkkJp9/limitconfig774444647943567679.conf.

How did you install Elasticsearch? Did you use the .rpm installer or the .tar.gz installer?

Has this Elasticsearch node been running for more than 10 days before you attempted to open the ML job?

If you are using the .tar.gz installer and your Elasticsearch node has been running for more than 10 days then this advice is probably relevant.

If you are using the .rpm installer then we need to work out why the systemd private temp directory functionality we added in 6.4 didn't work as expected.


(Pradip Das) #6

Thanks for pointing out the issue. It's resolved now. I created a custom folder and set the $ES_TMPDIR variable to the folder path, and it works.

Thank you very much for your help.
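
For anyone finding this later, a minimal sketch of that kind of fix is below. The function name setup_es_tmpdir and the example path are assumptions made for illustration; the exact folder used in this thread was not posted. The only Elasticsearch-specific piece is the ES_TMPDIR environment variable itself.

```shell
# Hypothetical helper: create a private temp directory and point ES_TMPDIR at it.
# Run it as the user that starts Elasticsearch so that user owns the directory.
setup_es_tmpdir() {
    dir="$1"
    mkdir -p "$dir" || return 1    # create the folder if it does not exist yet
    chmod 700 "$dir" || return 1   # owner-only access, as suggested earlier in the thread
    export ES_TMPDIR="$dir"        # picked up by Elasticsearch when it is started from this shell
}

# Example with an assumed path, for a .tar.gz install started from the same shell:
# setup_es_tmpdir "$HOME/es-tmp" && ./bin/elasticsearch
```

The idea is simply to keep the temp directory somewhere that periodic /tmp cleanup does not touch. For a service-managed install the variable would have to be set in the service's environment instead of an interactive shell.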

