I just tried upgrading my Kibana installation, and upgraded Logstash, Kibana, and Elasticsearch to version 6.8.1.
Now that I have upgraded, I am getting the page "Kibana server is not ready yet".
I am fairly new to asking for support and going through the forums, so apologies in advance if I left any information out. Is there anyone who may be familiar with this condition after upgrading to 6.8.1?
[root@ca-sna-elk02 ~]# systemctl -l status kibana
● kibana.service - Kibana
Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2019-07-05 23:19:29 PDT; 7h ago
Main PID: 21423 (node)
CGroup: /system.slice/kibana.service
└─21423 /usr/share/kibana/bin/../node/bin/node --no-warnings --max-http-header-size=65536 /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml
Jul 06 06:45:44 ca-sna-elk02.mercurytechnology.com kibana[21423]: {"type":"log","@timestamp":"2019-07-06T13:45:44Z","tags":["error","task_manager"],"pid":21423,"message":"Failed to poll for work: [search_phase_execution_exception] all shards failed :: {"path":"/.kibana_task_manager/_doc/_search","query":{"ignore_unavailable":true},"body":"{\"query\":{\"bool\":{\"must\":[{\"term\":{\"type\":\"task\"}},{\"bool\":{\"must\":[{\"terms\":{\"task.taskType\":[\"maps_telemetry\",\"vis_telemetry\"]}},{\"range\":{\"task.attempts\":{\"lte\":3}}},{\"range\":{\"task.runAt\":{\"lte\":\"now\"}}},{\"range\":{\"kibana.apiVersion\":{\"lte\":1}}}]}}]}},\"size\":10,\"sort\":{\"task.runAt\":{\"order\":\"asc\"}},\"seq_no_primary_term\":true}","statusCode":503,"response":"{\"error\":{\"root_cause\":,\"type\":\"search_phase_execution_exception\",\"reason\":\"all shards failed\",\"phase\":\"query\",\"grouped\":true,\"failed_shards\":},\"status\":503}"}"}
[root@ca-sna-elk02 ~]#
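The task_manager error above is a 503 "all shards failed" response against the .kibana_task_manager index, so one quick sanity check (a sketch, assuming Elasticsearch is reachable on the default localhost:9200) is to query the cluster health and the Kibana system indices directly:
# check overall cluster status (red usually means unassigned primary shards)
curl -s 'http://localhost:9200/_cluster/health?pretty'
# list the Kibana system indices and their health
curl -s 'http://localhost:9200/_cat/indices/.kibana*?v'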
I checked the Elasticsearch logs, and I see a lot of repeats that look similar to the below, but nothing that stands out in terms of a stack trace:
: 279079K->3529K(306688K), 0.0294922 secs] 2955564K->2680034K(4160256K), 0.0297563 secs] [Times: user=0.10 sys=0.00, real=0.03 secs]
2019-07-06T07:27:49.193-0700: 29765.209: Total time for which application threads were stopped: 0.0305303 seconds, Stopping threads took: 0.0001482 seconds
2019-07-06T07:27:49.196-0700: 29765.213: Total time for which application threads were stopped: 0.0022859 seconds, Stopping threads took: 0.0018357 seconds
2019-07-06T07:27:49.197-0700: 29765.213: Total time for which application threads were stopped: 0.0003249 seconds, Stopping threads took: 0.0000867 seconds
2019-07-06T07:28:06.515-0700: 29782.532: Total time for which application threads were stopped: 0.0006119 seconds, Stopping threads took: 0.0001356 seconds
2019-07-06T07:28:06.516-0700: 29782.532: Total time for which application threads were stopped: 0.0003234 seconds, Stopping threads took: 0.0000784 seconds
2019-07-06T07:28:06.516-0700: 29782.533: Total time for which application threads were stopped: 0.0003150 seconds, Stopping threads took: 0.0000751 seconds
2019-07-06T07:28:16.473-0700: 29792.490: Total time for which application threads were stopped: 0.0006392 seconds, Stopping threads took: 0.0001461 seconds
2019-07-06T07:28:16.474-0700: 29792.490: Total time for which application threads were stopped: 0.0004263 seconds, Stopping threads took: 0.0001250 seconds
2019-07-06T07:28:16.475-0700: 29792.491: Total time for which application threads were stopped: 0.0003651 seconds, Stopping threads took: 0.0000811 seconds
2019-07-06T07:28:28.854-0700: 29804.870: Total time for which application threads were stopped: 0.0005257 seconds, Stopping threads took: 0.0001341 seconds
2019-07-06T07:28:30.301-0700: 29806.318: Total time for which application threads were stopped: 0.0006203 seconds, Stopping threads took: 0.0001552 seconds
2019-07-06T07:28:30.302-0700: 29806.318: Total time for which application threads were stopped: 0.0003215 seconds, Stopping threads took: 0.0000768 seconds
2019-07-06T07:28:30.302-0700: 29806.319: Total time for which application threads were stopped: 0.0003576 seconds, Stopping threads took: 0.0000750 seconds
2019-07-06T07:29:10.909-0700: 29846.925: [GC (Allocation Failure) 2019-07-06T07:29:10.909-0700: 29846.925: [ParNew
Desired survivor size 17432576 bytes, new threshold 6 (max 6)
age 1: 3828008 bytes, 3828008 total
age 2: 204456 bytes, 4032464 total
age 3: 242632 bytes, 4275096 total
age 4: 10304 bytes, 4285400 total
age 5: 17568 bytes, 4302968 total
age 6: 15640 bytes, 4318608 total
: 275856K->5009K(306688K), 0.0228942 secs] 2952361K->2681522K(4160256K), 0.0231742 secs] [Times: user=0.08 sys=0.00, real=0.02 secs]
Well, it seems that your Elasticsearch logs are missing. Can you check the path.logs configuration in elasticsearch.yml and see whether the logs are somewhere else?
If not, try using: sudo journalctl -xe -u <elasticsearch service name>
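For example (a sketch, assuming the service is named elasticsearch and the default RPM log location; your paths may differ), the relevant elasticsearch.yml setting and the journalctl call would look something like:
# elasticsearch.yml -- path.logs controls where Elasticsearch writes its log files
path.logs: /var/log/elasticsearch
# fall back to the systemd journal if no log files turn up on disk
sudo journalctl -xe -u elasticsearch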
It looks like the log path is in a different location. I do have the main log now and certainly see some errors. It is a 41 MB file, though, and there are so many errors that I am not sure where to even start:
Caused by: java.nio.file.FileSystemException: /u01/elasticsearch/nodes/0/indices/LH43SX3dTSqUN0iszHdBnQ/0/translog/translog-155.ckp: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214) ~[?:?]
at java.nio.file.Files.newByteChannel(Files.java:361) ~[?:1.8.0_212]
at java.nio.file.Files.newByteChannel(Files.java:407) ~[?:1.8.0_212]
at org.apache.lucene.store.SimpleFSDirectory.openInput(SimpleFSDirectory.java:77) ~[lucene-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:16:28]
at org.elasticsearch.index.translog.Checkpoint.read(Checkpoint.java:188) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.translog.Translog.recoverFromFiles(Translog.java:243) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.translog.Translog.<init>(Translog.java:189) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:481) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:196) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:168) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1494) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:1444) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:440) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:95) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:310) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:93) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1737) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$9(IndexShard.java:2390) ~[elasticsearch-6.8.1.jar:6.8.1]
... 4 more
[2019-07-06T08:06:58,137][WARN ][o.e.i.c.IndicesClusterStateService] [elk02] [[wazuh-alerts-3.x-2018.03.19][0]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [wazuh-alerts-3.x-2018.03.19][0]: Recovery failed on {elk02}{DVhqmG-pStKCiiRaEO0Y_w}{sFw1Y8eyTDuY9AiodRM2_w}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=25008992256, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$9(IndexShard.java:2394) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-6.8.1.jar:6.8.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: org.elasticsearch.index.shard.IndexShardRecoveryException: failed to recover from gateway
at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:445) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:95) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:310) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:93) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1737) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$9(IndexShard.java:2390) ~[elasticsearch-6.8.1.jar:6.8.1]
... 4 more
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: failed to create engine
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:208) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:168) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1494) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:1444) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:440) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:95) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:310) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:93) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1737) ~[elasticsearch-6.8.1.jar:6.8.1]
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$9(IndexShard.java:2390) ~[elasticsearch-6.8.1.jar:6.8.1]
... 4 more
Caused by: java.nio.file.FileSystemException: /u01/elasticsearch/nodes/0/indices/vpMoMx1TTf6SI2psS80o8A/0/translog/translog-12.ckp: Too many open files
Can you please check the max open files? Log in as the user that runs the Elasticsearch service and use the following command:
ulimit -Hn
What OS are you using?
It seems that the user has reached the max open files limit and the kernel won't allow it to open any more, so Elasticsearch cannot open the files it needs in order to open and recover the indices.
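If the limit turns out to be too low, one common way to raise it (a sketch, assuming a systemd-managed RPM/DEB install of Elasticsearch; adjust the service and user names to your setup) is:
# /etc/security/limits.conf -- raise the per-user file-descriptor limit
elasticsearch - nofile 65535
# for a systemd-managed service, set the limit in a unit override instead:
sudo systemctl edit elasticsearch
# add the following two lines in the editor:
#   [Service]
#   LimitNOFILE=65535
# then reload systemd and restart the service
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch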
Thank you. I have followed those steps and restarted the box, but unfortunately I am still seeing the same page: "Kibana server is not ready yet".