Not able to start machine learning job

SathishPrakasam · January 9, 2020, 4:36pm

We are configuring an ML job in our cluster.
After creating the job, I tried to start it and got the error below:

Caused by: java.nio.file.NoSuchFileException: /tmp/elasticsearch-6776145032467540758/limitconfig1253236941500878778.conf
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[?:?]
at java.nio.file.Files.newByteChannel(Files.java:361) ~[?:1.8.0_171]
at java.nio.file.Files.createFile(Files.java:632) ~[?:1.8.0_171]
at java.nio.file.TempFileHelper.create(TempFileHelper.java:138) ~[?:1.8.0_171]
at java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:161) ~[?:1.8.0_171]
at java.nio.file.Files.createTempFile(Files.java:852) ~[?:1.8.0_171]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectBuilder.buildLimits(AutodetectBuilder.java:274) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectBuilder.build(AutodetectBuilder.java:182) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.NativeAutodetectProcessFactory.createNativeProcess(NativeAutodetectProcessFactory.java:109) ~[?:?]

This looks like a known error. The suggested solution is to set -Djava.io.tmpdir to a constant value, so I made that change and restarted the master nodes for it to take effect.
The JVM process now shows the temp directory as /tmp/elasticsearch-4034599267312088687.
However, when I start the ML job it complains that a file in a different directory, elasticsearch-6776145032467540758/limitconfig1253236941500878778.conf, is not available.
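
A quick way to narrow this down on each node is to confirm that the temp directory the node is configured to use actually exists and is writable by the user running Elasticsearch. The following is only a minimal Python sketch of such a check: `check_tmpdir` is a hypothetical helper, not part of Elasticsearch, and it assumes the directory is taken from the `ES_TMPDIR` environment variable, falling back to the system temp directory.

```python
import os
import tempfile


def check_tmpdir(path):
    """Return a list of problems found with a temp directory path (empty if none)."""
    problems = []
    if not os.path.exists(path):
        problems.append("missing")
    elif not os.path.isdir(path):
        problems.append("not a directory")
    elif not os.access(path, os.W_OK | os.X_OK):
        # Creating files in a directory needs write and search permission.
        problems.append("not writable")
    return problems


if __name__ == "__main__":
    # Assumption: the node's temp directory is exported as ES_TMPDIR.
    path = os.environ.get("ES_TMPDIR", tempfile.gettempdir())
    print(path, check_tmpdir(path) or "ok")
```

Run it as the same user that runs Elasticsearch, on every node that can host ML jobs, since the job may be assigned to a node other than the one that was reconfigured.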

Can someone help me in cracking this issue?

SathishPrakasam · January 10, 2020, 10:08am

After setting ES_TMPDIR=/tmp/elasticsearch-4034599267312088687 I managed to get rid of the directory mismatch.
But every time, it looks for a randomly named limitconfig file. For example, the first time I started the job it looked in the directory /tmp/elasticsearch-4034599267312088687 for the file limitconfig4206780411356268755.conf.
The second time I started the ML job, it looked for the file limitconfig7936535717952102862.conf inside the same directory, /tmp/elasticsearch-4034599267312088687.
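
The changing file name is probably not the fault in itself. The stack trace shows the file being created through `Files.createTempFile`, which picks a fresh random name on every call and throws `NoSuchFileException` when the target directory does not exist. The sketch below reproduces the same pattern with Python's `tempfile` module as an analogue. It is not Elasticsearch code, and `make_limit_config` and `demo` are hypothetical names used only for illustration.

```python
import os
import shutil
import tempfile


def make_limit_config(tmpdir):
    """Create a uniquely named temp file in tmpdir and return its path."""
    fd, path = tempfile.mkstemp(prefix="limitconfig", suffix=".conf", dir=tmpdir)
    os.close(fd)
    return path


def demo():
    """Return (names_differ, failed_after_delete) for a throwaway directory."""
    base = tempfile.mkdtemp(prefix="elasticsearch-")
    first = make_limit_config(base)
    second = make_limit_config(base)
    names_differ = first != second  # a new random name on each call

    shutil.rmtree(base)  # simulate the temp directory being removed
    try:
        make_limit_config(base)
        failed = False
    except FileNotFoundError:
        # The error names the new file, but the missing piece is the directory.
        failed = True
    return names_differ, failed
```

If the same thing is happening here, the path in the error names a file that was never created, and the thing to verify is whether the directory still exists on the node where the job is being opened.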

Can someone please help with this?

edsavage · January 15, 2020, 2:57pm

Hi Sathishkumar,

Yes, I will pick this query up and ensure you receive a response to it shortly.

In the meantime, could you please obtain, package up and provide the log files from every ML node in the cluster? Sending them to me in a direct message on here is secure and a good way of doing this.

Best wishes,

Ed

system · February 12, 2020, 2:57pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
