Elasticsearch not starting on one node after upgrading JDK from 1.8 to 12

Hi,

I have a three-node Elasticsearch cluster. On one of the nodes I installed Rally, but Rally would not run because it requires JDK 12, so I installed JDK 12. After that Rally started running, but now Elasticsearch is down on that node.

I am not getting any errors in the Elasticsearch logs. I tried removing JDK 12 and reinstalled the previous JDK 1.8, but it is still not starting. Attaching logs.

Any help in debugging is appreciated. Thanks!

Elasticsearch 6.5.4 doesn't support JDK 12. The support matrix says it supports Oracle/OpenJDK 1.8.0u111+ and Oracle/OpenJDK 11.
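
If it helps, one way to confirm which JVM each node is actually running is the nodes info API (this sketch assumes the default HTTP port 9200 on localhost; adjust the host as needed), or simply checking the java binary on the node itself:

# Report the JVM name/version for every node in the cluster
curl -s 'localhost:9200/_nodes/jvm?pretty' | grep -E '"vm_(name|version)"'

# Check which Java the shell on that node would pick up
java -version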

Yes, I checked that later.

I have reverted back to Java 1.8 but the node is still down. Is there any way we can bring the node up, or any debugging we can do?

One more question: if I want to install Rally on the same node as ES, and Rally requires Java 12 to start, what can be done?

Rally error:

[INFO] Preparing for race ...
[ERROR] Cannot race. ('JAVA_HOME points to JDK 8 but it should point to JDK 12.', None)

Thanks!

I see, sorry, I missed that in the original post. It looks like the ML controller process is stopping for some reason. "Terminated by signal 9" suggests something sent it a SIGKILL, but that seems surprising. Can you share more of the log, and share it as text rather than an image so it can be searched?

You can install multiple JVMs on a machine. Normally I think one would set the $JAVA_HOME environment variable to choose between them.
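
For example (the paths and the Rally invocation below are illustrative, assuming both JDKs are installed side by side), you could point JAVA_HOME at JDK 12 only in the shell session you use for Rally, rather than changing it system-wide:

# In the shell used to run Rally only; the paths will differ on your system
export JAVA_HOME=/usr/lib/jvm/jdk-12
export PATH="$JAVA_HOME/bin:$PATH"
esrally --distribution-version=6.5.4

A service started by systemd does not inherit that shell's environment, so the Elasticsearch service should keep running on JDK 8.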

Thanks for your reply.

Yes, I can share the logs. Below is the error I got before the ES node went down. I have restarted Elasticsearch multiple times, but it shuts down without giving any error in the logs. Is there any debug method to get more error information in the logs when I start ES?

[2019-04-05T09:08:26,222][INFO ][o.e.x.w.a.l.ExecutableLoggingAction] [node-1] Hits exceeded 39304 1
[2019-04-05T09:18:26,465][INFO ][o.e.x.w.a.l.ExecutableLoggingAction] [node-1] Hits exceeded 39304 1
[2019-04-05T09:28:27,311][INFO ][o.e.x.w.a.l.ExecutableLoggingAction] [node-1] Hits exceeded 39304 1
[2019-04-05T09:38:28,013][INFO ][o.e.x.w.a.l.ExecutableLoggingAction] [node-1] Hits exceeded 39304 1
[2019-04-05T09:39:50,681][INFO ][o.e.x.m.j.p.a.NativeAutodetectProcess] [node-1] [ashnikcom_web_traffic_count] State output finished
[2019-04-05T09:39:50,887][ERROR][o.e.x.m.j.p.a.NativeAutodetectProcess] [node-1] [ashnikcom_web_traffic_count] autodetect process stopped unexpectedly:
[2019-04-05T09:39:50,926][ERROR][o.e.x.m.j.p.l.CppLogMessageHandler] [node-1] [controller/14730] [CDetachedProcessSpawner.cc@184] Child process with PID 26970 was terminated by signal 9
[2019-04-05T09:39:52,616][WARN ][o.e.x.m.j.p.a.o.AutoDetectResultProcessor] [node-1] [ashnikcom_web_traffic_count] some results not processed due to the termination of autodetect
[2019-04-05T09:39:52,653][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [node-1] Successfully set job state to [failed] for job [ashnikcom_web_traffic_count]
[2019-04-05T09:39:52,605][ERROR][o.e.x.m.j.p.a.AutodetectCommunicator] [node-1] [ashnikcom_web_traffic_count] Unexpected exception writing to process
org.elasticsearch.ElasticsearchException: [ashnikcom_web_traffic_count] Unexpected death of autodetect:
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectCommunicator.checkProcessIsAlive(AutodetectCommunicator.java:304) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectCommunicator.waitFlushToCompletion(AutodetectCommunicator.java:279) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectCommunicator.lambda$flushJob$4(AutodetectCommunicator.java:238) ~[?:?]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectCommunicator$1.doRun(AutodetectCommunicator.java:360) ~[?:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) ~[elasticsearch-6.5.4.jar:6.5.4]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.5.4.jar:6.5.4]
at org.elasticsearch.xpack.ml.job.process.autodetect.AutodetectProcessManager$AutodetectWorkerExecutorService.start(AutodetectProcessManager.java:779) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_191]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-6.5.4.jar:6.5.4]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
[2019-04-05T09:39:52,658][ERROR][o.e.x.m.j.p.a.AutodetectProcessManager] [node-1] [ashnikcom_web_traffic_count] exception while flushing job
[2019-04-05T09:39:59,663][WARN ][o.e.m.j.JvmGcMonitorService] [node-1] [gc][young][1826279][293818] duration [1.5s], collections [1]/[14.8s], total [1.5s]/[1.1h], memory [1.2gb]->[1.1gb]/[3.9gb], all_pools {[young] [194.9mb]->[98.8mb]/[266.2mb]}{[survivor] [9.5mb]->[19.7mb]/[33.2mb]}{[old] [1gb]->[1gb]/[3.6gb]}
[2019-04-05T09:58:49,408][INFO ][o.e.e.NodeEnvironment ] [node-1] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [145.3gb], net total_space [159.9gb], types [rootfs]
[2019-04-05T09:58:49,412][INFO ][o.e.e.NodeEnvironment ] [node-1] heap size [3.9gb], compressed ordinary object pointers [true]
[2019-04-05T09:58:50,032][INFO ][o.e.n.Node ] [node-1] node name [node-1], node ID [WUJSMbDxSTWSfrpSicLCBw]
[2019-04-05T09:58:50,032][INFO ][o.e.n.Node ] [node-1] version[6.5.4], pid[16822], build[default/rpm/d2ef93d/2018-12-17T21:17:40.758843Z], OS[Linux/3.10.0-862.2.3.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/12/12+33]
[2019-04-05T09:58:50,032][INFO ][o.e.n.Node ] [node-1] JVM arguments [-Xms4g, -Xmx4g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.k51UznFx, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/var/lib/elasticsearch, -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -XX:UseAVX=2, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/etc/elasticsearch, -Des.distribution.flavor=default, -Des.distribution.type=rpm]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [aggs-matrix-stats]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [analysis-common]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [ingest-common]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [lang-expression]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [lang-mustache]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [lang-painless]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [mapper-extras]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [parent-join]
[2019-04-05T09:58:52,095][INFO ][o.e.p.PluginsService ] [node-1] loaded module [percolator]
[2019-04-05T09:58:52,096][INFO ][o.e.p.PluginsService ] [node-1] loaded module [rank-eval]

Yes, but if I set JAVA_HOME to JDK 12, which is required by Rally, my ES node goes down.

Hey @Nikhil04,

The autodetect process could be getting killed by the OS.

What OS and version are you running?

OOM killer on Linux could be axing it: https://github.com/elastic/elasticsearch/issues/22788#issuecomment-299886971

Seccomp could be killing it as well: https://github.com/elastic/ml-cpp/pull/354

You will need to check the system's syslog to find out why the OS is killing the process.
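
A minimal sketch of that check, assuming a RHEL/CentOS 7 system like the one shown in the logs above (log locations vary by distribution):

# Look for OOM-killer activity around the time the autodetect process was killed
sudo grep -iE 'out of memory|killed process' /var/log/messages
sudo dmesg -T | grep -iE 'oom|killed process'

# If auditd is running, look for seccomp denials that could explain the SIGKILL
sudo grep -i seccomp /var/log/audit/audit.log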

Hi @BenTrent @DavidTurner,

Thank you for your time.
I didn't do anything, but Elasticsearch has now started successfully on the node that was failing.

That's good to hear.

Regarding the JVM version question:

I will admit I'm a bit confused about this. JDK 12 is currently required to build the development versions of Elasticsearch, but of course you can use Rally to benchmark released versions too, and these released versions do not necessarily work with JDK 12.

My development environment is set up with multiple JVMs as described in the Rally docs, with environment variables called $JAVA8_HOME, ..., $JAVA12_HOME, and I think that means Rally will choose the right JVM for its runs. If that's not the case, I suggest opening another thread about this point so it doesn't get lost here.
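
For reference, a sketch of that setup (the paths are illustrative and will differ per system):

# In the shell (or ~/.bashrc) used to run Rally
export JAVA8_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export JAVA11_HOME=/usr/lib/jvm/java-11-openjdk
export JAVA12_HOME=/usr/lib/jvm/jdk-12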

Sure. I will check on that.

Thanks!
