Kibana service restarting after a few seconds

We have Kibana 7.4 installed on two servers where Elasticsearch is also installed. The Kibana service keeps getting restarted after a short while.

Can someone please help?

Here is the Kibana config file:

cat /etc/kibana/kibana.yml
# Ansible managed

server.port: 8601
server.host: "0.0.0.0"
server.name: kibana_2

elasticsearch.requestTimeout: 30000

elasticsearch.hosts: ['http://es1_ip:5200', 'http://es2_ip:5200', 'http://es3_ip:5200']
elasticsearch.username: "elastic"
elasticsearch.password: "XXXXXX"

Kibana service file:

cat   /etc/systemd/system/kibana.service
[Unit]
Description=Kibana
StartLimitIntervalSec=30
StartLimitBurst=3

[Service]
Type=simple
User=kibana
Group=kibana
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/kibana
EnvironmentFile=-/etc/sysconfig/kibana
ExecStart=/usr/share/kibana/bin/kibana "-c /etc/kibana/kibana.yml"
Restart=always
WorkingDirectory=/

[Install]
WantedBy=multi-user.target

Permissions on kibana.yml:

-rw-rw---- 1 root kibana  331 Apr 16 20:06 kibana.yml

Here is the Elasticsearch config:

cat /etc/elasticsearch/elasticsearch.yml
# Ansible managed

node.name: elasticsearch_1
path.data: /opt/elasticsearch
path.logs: /var/log/elasticsearch
http.port: 5200
transport.port: 5300
path.repo: /opt/elastic_snapshot

cluster.name: xxx
cluster.initial_master_nodes: ['es1_ip', 'es2_ip', 'es3_ip']
discovery.seed_hosts: ['es1_ip', 'es2_ip', 'es3_ip']
network.host: 0.0.0.0
http.host: 0.0.0.0
node.master: true
node.data: true
node.ingest: true
discovery.zen.minimum_master_nodes: 2
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elasticsearch_1.p12
xpack.security.transport.ssl.truststore.path: elasticsearch_1.p12
# xpack.security.http.ssl.verification_mode: certificate
# xpack.security.http.ssl.keystore.path: elasticsearch_1.p12
# xpack.security.http.ssl.truststore.path: elasticsearch_1.p12

xpack.monitoring.collection.enabled: true
xpack.security.enabled: true

Both processes show as running, but Kibana keeps getting restarted (the unit has Restart=always, so systemd relaunches Kibana every time it exits).

ps -ef |grep kibana
root      99171  93611  0 20:26 pts/0    00:00:00 grep --color=auto kibana
kibana    99172      1  0 20:26 ?        00:00:00 /usr/share/kibana/bin/../node/bin/node /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml

ps -ef |grep elasticsearch
elastic+  90792      1  3 Apr15 ?        00:42:43 /usr/share/elasticsearch/jdk/bin/java -Xms5288m -Xmx5288m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch-9885684502051401189 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Djava.locale.providers=COMPAT -Xms5288m -Xmx5288m -Dio.netty.allocator.type=pooled -XX:MaxDirectMemorySize=2772434944 -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch -Des.distribution.flavor=default -Des.distribution.type=rpm -Des.bundled_jdk=true -cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet
elastic+  90948  90792  0 Apr15 ?        00:00:00 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller
root      99588  93611  0 20:28 pts/0    00:00:00 grep --color=auto elasticsearch

systemctl status elasticsearch.service -l
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/elasticsearch.service.d
           └─startup-timeout.conf
   Active: active (running) since Thu 2021-04-15 21:13:31 IST; 22h ago
     Docs: http://www.elastic.co
 Main PID: 108327 (java)
   CGroup: /system.slice/elasticsearch.service
           ├─108327 /usr/share/elasticsearch/jdk/bin/java -Xms5288m -Xmx5288m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch-4761301156106937349 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Djava.locale.providers=COMPAT -Xms5288m -Xmx5288m -Dio.netty.allocator.type=pooled -XX:MaxDirectMemorySize=2772434944 -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch -Des.distribution.flavor=default -Des.distribution.type=rpm -Des.bundled_jdk=true -cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet
           └─108449 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

Apr 15 21:12:15 systemd[1]: Starting Elasticsearch...
Apr 15 21:12:22 elasticsearch[108327]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Apr 15 21:13:31 systemd[1]: Started Elasticsearch.

Checking Kibana returns 503 while it is starting up, and connection refused once the process has died:

curl -I http://127.0.0.1:8601/status
HTTP/1.1 503 Service Unavailable
retry-after: 30
content-type: text/html; charset=utf-8
cache-control: no-cache
content-length: 30
Date: Fri, 16 Apr 2021 14:39:47 GMT
Connection: keep-alive

curl -s http://127.0.0.1:8601/api/status
Kibana server is not ready yet

curl -v -s http://127.0.0.1:8601/api/status | jsonpp
* About to connect() to 127.0.0.1 port 8601 (#0)
*   Trying 127.0.0.1...
* Connection refused
* Failed connect to 127.0.0.1:8601; Connection refused
* Closing connection 0

Getting the below error while checking the health of ES:

curl -XGET 'http://localhost:5200/_cluster/health' -u elastic
Enter host password for user 'elastic':
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

Below are the Kibana logs:

Apr 16 19:47:56  kibana[59465]: {"type":"log","@timestamp":"2021-04-16T14:17:56Z","tags":["reporting","warning"],"pid":59465,"message":"Generating a random key for xpack.reporting.encryptionKey. To prevent pending reports from failing on restart, please set xpack.reporting.encryptionKey in kibana.yml"}
kibana[59465]: {"type":"log","@timestamp":"2021-04-16T14:17:56Z","tags":["status","plugin:reporting@7.4.0","error"],"pid":59465,"state":"red","message":"Status changed from uninitialized to red - [data] Elasticsearch cluster did not respond with license information.","prevState":"uninitialized","prevMsg":"uninitialized"}
kibana[59465]: {"type":"log","@timestamp":"2021-04-16T14:17:56Z","tags":["status","plugin:security@7.4.0","error"],"pid":59465,"state":"red","message":"Status changed from green to red - [data] Elasticsearch cluster did not respond with license information.","prevState":"green","prevMsg":"Ready"}
kibana[59465]: Could not create APM Agent configuration: Request Timeout after 30000ms
kibana[59465]: {"type":"error","tags":["warning","process"],"pid":59465,"level":"error","error":
{"message":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)","name":"UnhandledPromiseRejectionWarning","stack":"UnhandledPromiseRejectionWarning: Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)\n    at emitWarning (internal/process/promises.js:81:15)\n    at emitPromiseRejectionWarnings (internal/process/promises.js:120:9)\n    at process._tickCallback (internal/process/next_tick.js:69:34)"},"message":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)"}
kibana[59465]: {"type":"error","tags":["warning","process"],"pid":59465,"level":"error","error":
{"message":"Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)","name":"UnhandledPromiseRejectionWarning","stack":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)"},"message":"Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)"}
kibana[59465]: {"type":"log","tags":["reporting","warning"],"pid":59465,"message":"Reporting plugin self-check failed. Please check the Kibana Reporting settings. Error: Request Timeout after 30000ms"}
kibana[59465]: {"type":"log","tags":["fatal","root"],"pid":59465,"message":"{ Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)\n  status: undefined,\n  displayName: 'RequestTimeout',\n  message: 'Request Timeout after 30000ms',\n  body: undefined,\n  isBoom: true,\n  isServer: true,\n  data: null,\n  output:\n   { statusCode: 503,\n     payload:\n      { statusCode: 503,\n        error: 'Service Unavailable',\n        message: 'Request Timeout after 30000ms' },\n     headers: {} },\n  reformat: [Function],\n  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/esUnavailable' }"}
kibana[59465]: {"type":"log","tags":["info","plugins-system"],"pid":59465,"message":"Stopping all plugins."}
kibana[59465]: {"type":"log","tags":["info","plugins","data"],"pid":59465,"message":"Stopping plugin"}
kibana[59465]: FATAL  Error: Request Timeout after 30000ms
systemd[1]: kibana.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Unit kibana.service entered failed state.
systemd[1]: kibana.service failed.
systemd[1]: kibana.service holdoff time over, scheduling restart.
systemd[1]: Stopped Kibana.
systemd[1]: Started Kibana.
kibana[59741]: {"type":"log","tags":["info","plugins-system"],"pid":59741,"message":"Setting up [4] plugins: [security,translations,inspector,data]"}
kibana[59741]: {"type":"log","tags":["info","plugins","security"],"pid":59741,"message":"Setting up plugin"}
kibana[59741]: {"type":"log","tags":["warning","plugins","security","config"],"pid":59741,"message":"Generating a random key for xpack.security.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.security.encryptionKey in kibana.yml"}
kibana[59741]: {"type":"log","tags":["warning","plugins","security","config"],"pid":59741,"message":"Session cookies will be transmitted over insecure connections. This is not recommended."}
kibana[59741]: {"type":"log","tags":["info","plugins","translations"],"pid":59741,"message":"Setting up plugin"}
kibana[59741]: {"type":"log","tags":["info","plugins","data"],"pid":59741,"message":"Setting up plugin"}
kibana[59741]: {"type":"log","tags":["info","plugins-system"],"pid":59741,"message":"Starting [3] plugins: [security,translations,data]"}
kibana[59741]: {"type":"log","tags":["status","plugin:kibana@7.4.0","info"],"pid":59741,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
kibana[59741]: {"type":"log","tags":["status","plugin:elasticsearch@7.4.0","info"],"pid":59741,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
kibana[59741]: {"type":"log","tags":["status","plugin:xpack_main@7.4.0","info"],"pid":59741,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
kibana[69723]: {"type":"error","tags":["warning","process"],"pid":69723,"level":"error","error":{"message":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)","name":"UnhandledPromiseRejectionWarning","stack":"UnhandledPromiseRejectionWarning: Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)\n    at emitWarning (internal/process/promises.js:81:15)\n    at emitPromiseRejectionWarnings (internal/process/promises.js:120:9)\n    at process._tickCallback (internal/process/next_tick.js:69:34)"},"message":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)"}
kibana[69723]: {"type":"error","tags":["warning","process"],"pid":69723,"level":"error","error":{"message":"Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)","name":"UnhandledPromiseRejectionWarning","stack":"Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)"},"message":"Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)"}
kibana[69723]: {"type":"log","tags":["debug","monitoring","kibana-monitoring"],"pid":69723,"message":"Received Kibana Ops event data"}
kibana[69723]: {"type":"log","tags":["license","debug","xpack"],"pid":69723,"message":"Calling [data] Elasticsearch _xpack API. Polling frequency: 30001"}
kibana[69723]: {"type":"log","tags":["plugin","debug"],"pid":69723,"message":"Checking Elasticsearch version"}
kibana[69723]: {"type":"log","tags":["reporting","warning"],"pid":69723,"message":"Reporting plugin self-check failed. Please check the Kibana Reporting settings. Error: Request Timeout after 30000ms"}
kibana[69723]: {"type":"log","tags":["debug","upgrade_assistant","reindex_worker"],"pid":69723,"message":"Stopping worker..."}
kibana[69723]: {"type":"log","tags":["debug","upgrade_assistant","reindex_worker"],"pid":69723,"message":"Could not fetch reindex operations from Elasticsearch"}
kibana[69723]: {"type":"log","tags":["debug","root"],"pid":69723,"message":"shutting root down"}
kibana[69723]: {"type":"log","tags":["fatal","root"],"pid":69723,"message":"{ Error: Request Timeout after 30000ms\n    at /usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:397:9\n    at Timeout.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:429:7)\n    at ontimeout (timers.js:436:11)\n    at tryOnTimeout (timers.js:300:5)\n    at listOnTimeout (timers.js:263:5)\n    at Timer.processTimers (timers.js:223:10)\n  status: undefined,\n  displayName: 'RequestTimeout',\n  message: 'Request Timeout after 30000ms',\n  body: undefined,\n  isBoom: true,\n  isServer: true,\n  data: null,\n  output:\n   { statusCode: 503,\n     payload:\n      { statusCode: 503,\n        error: 'Service Unavailable',\n        message: 'Request Timeout after 30000ms' },\n     headers: {} },\n  reformat: [Function],\n  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/esUnavailable' }"}
kibana[69723]: {"type":"log","tags":["debug","server"],"pid":69723,"message":"stopping server"}
kibana[69723]: {"type":"log","tags":["info","plugins","translations"],"pid":69723,"message":"Stopping plugin"}
kibana[69723]: {"type":"log","tags":["debug","elasticsearch-service"],"pid":69723,"message":"Closing elasticsearch clients"}
kibana[69723]: {"type":"log","tags":["debug","http","server","Kibana"],"pid":69723,"message":"stopping http server"}
kibana[69723]: FATAL  Error: Request Timeout after 30000ms
systemd[1]: kibana.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Unit kibana.service entered failed state.
systemd[1]: kibana.service failed.
systemd[1]: kibana.service holdoff time over, scheduling restart.
systemd[1]: Stopped Kibana.
systemd[1]: Started Kibana.
kibana[69835]: {"type":"log","@timestamp":"2021-04-16T14:37:06Z","tags":["debug","config"],"pid":69835,"message":"Marking config path as handled: server"}
kibana[69835]: {"type":"log","@timestamp":"2021-04-16T14:37:10Z","tags":["debug","http"],"pid":69835,"message":"Kibana server is not ready yet get:/login."}

ES logs:

[2021-04-15T18:20:01,704][INFO ][o.e.c.c.JoinHelper       ] [elasticsearch_1] failed to join {elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{eIe879wqQqWn4Al0r_IHcA}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={elasticsearch_1}{6OyoyA40S3-z6M7nAnCGIA}{1ltohL1FS6iNd230MM-EcA}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=106925, lastAcceptedTerm=106923, lastAcceptedVersion=422302, sourceNode={elasticsearch_1}{6OyoyA40S3-z6M7nAnCGIA}{1ltohL1FS6iNd230MM-EcA}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{eIe879wqQqWn4Al0r_IHcA}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
	at org.elasticsearch.cluster.coordination.Coordinator$2.onFailure(Coordinator.java:513) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1120) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:243) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.4.0.jar:7.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_1][10.191.156.153:5300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 6cgqHB-bTyqnV37mWj3IZg than local cluster uuid P2CGl6sJQjOAuGxTM92Akw, rejecting
	at java.lang.Thread.run(Thread.java:830) ~[?:?]
[2021-04-15T18:20:02,702][INFO ][o.e.c.c.JoinHelper       ] [elasticsearch_1] failed to join {elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{eIe879wqQqWn4Al0r_IHcA}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={elasticsearch_1}{6OyoyA40S3-z6M7nAnCGIA}{1ltohL1FS6iNd230MM-EcA}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=106925, lastAcceptedTerm=106923, lastAcceptedVersion=422302, sourceNode={elasticsearch_1}{6OyoyA40S3-z6M7nAnCGIA}{1ltohL1FS6iNd230MM-EcA}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{eIe879wqQqWn4Al0r_IHcA}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
	at org.elasticsearch.cluster.coordination.Coordinator$2.onFailure(Coordinator.java:513) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1120) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:243) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.4.0.jar:7.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_1][10.191.156.153:5300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 6cgqHB-bTyqnV37mWj3IZg than local cluster uuid P2CGl6sJQjOAuGxTM92Akw, rejecting

[2021-04-15T18:56:55,127][WARN ][r.suppressed             ] [elasticsearch_1] path: /_monitoring/bulk, params: {system_id=logstash, system_api_version=7, interval=1s}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized, SERVICE_UNAVAILABLE/2/no master];
	at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:175) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.monitoring.action.TransportMonitoringBulkAction.doExecute(TransportMonitoringBulkAction.java:55) ~[?:?]
	at org.elasticsearch.xpack.monitoring.action.TransportMonitoringBulkAction.doExecute(TransportMonitoringBulkAction.java:35) ~[?:?]
	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:153) [elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$apply$0(SecurityActionFilter.java:86) [x-pack-security-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$authorizeRequest$4(SecurityActionFilter.java:172) [x-pack-security-7.4.0.jar:7.4.0]
	
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:43) [elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:120) [x-pack-core-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$consumeToken$13(AuthenticationService.java:374) [x-pack-security-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.security.authc.support.CachingUsernamePasswordRealm.lambda$authenticateWithCache$1(CachingUsernamePasswordRealm.java:145) [x-pack-security-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.security.authc.support.CachingUsernamePasswordRealm.handleCachedAuthentication(CachingUsernamePasswordRealm.java:196) [x-pack-security-7.4.0.jar:7.4.0]
	at org.elasticsearch.xpack.security.authc.support.CachingUsernamePasswordRealm.lambda$authenticateWithCache$2(CachingUsernamePasswordRealm.java:137) [x-pack-security-7.4.0.jar:7.4.0]

	at org.elasticsearch.xpack.security.rest.SecurityRestFilter.handleRequest(SecurityRestFilter.java:55) [x-pack-security-7.4.0.jar:7.4.0]
	at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:222) [elasticsearch-7.4.0.jar:7.4.0]
	
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918) [netty-common-4.1.38.Final.jar:4.1.38.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.38.Final.jar:4.1.38.Final]
	at java.lang.Thread.run(Thread.java:830) [?:?]
[2021-04-15T18:56:55,239][DEBUG][o.e.a.a.i.c.TransportCreateIndexAction] [elasticsearch_1] timed out while retrying [indices:admin/create] after failure (timeout [1m])
[2021-04-15T18:56:55,506][INFO ][o.e.c.c.JoinHelper       ] [elasticsearch_1] failed to join {elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{eIe879wqQqWn4Al0r_IHcA}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={elasticsearch_1}{6OyoyA40S3-z6M7nAnCGIA}{1ltohL1FS6iNd230MM-EcA}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=106925, lastAcceptedTerm=106923, lastAcceptedVersion=422302, sourceNode={elasticsearch_1}{6OyoyA40S3-z6M7nAnCGIA}{1ltohL1FS6iNd230MM-EcA}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{eIe879wqQqWn4Al0r_IHcA}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalStateException: failure when sending a validation request to node
	at org.elasticsearch.cluster.coordination.Coordinator$2.onFailure(Coordinator.java:513) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:59) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1120) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.transport.InboundHandler.lambda$handleException$2(InboundHandler.java:243) ~[elasticsearch-7.4.0.jar:7.4.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.4.0.jar:7.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_1][10.191.156.153:5300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 6cgqHB-bTyqnV37mWj3IZg than local cluster uuid P2CGl6sJQjOAuGxTM92Akw, rejecting

This is an Elasticsearch problem: Kibana can't start if Elasticsearch isn't running properly, which is the case here (master_not_discovered_exception).

Judging from the ES logs, the individual nodes form individual clusters for some reason (join validation on cluster state with a different cluster uuid 6cgqHB-bTyqnV37mWj3IZg than local cluster uuid P2CGl6sJQjOAuGxTM92Akw, rejecting).

Is it possible the individual nodes have some stale state in their data directories, so they believe they belong to another cluster?
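
A quick way to confirm: each node reports its cluster UUID on the root endpoint, so (using the port and credentials from your config) you can compare them across the three nodes:

curl -s -u elastic 'http://es1_ip:5200/?pretty' | grep cluster_uuid
curl -s -u elastic 'http://es2_ip:5200/?pretty' | grep cluster_uuid
curl -s -u elastic 'http://es3_ip:5200/?pretty' | grep cluster_uuid

If the UUIDs differ between nodes, they really have bootstrapped separate clusters.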

How can we delete stale state from data directories?

I'm no expert on this, but deleting the contents should work fine here. Only do this if you don't have data stored in this cluster that you want to retain (as everything will be lost).
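
If you do go that route, roughly something like this on each diverged node (a sketch, assuming your path.data of /opt/elasticsearch and the elasticsearch service user; moving the directory aside instead of deleting it keeps a fallback):

systemctl stop elasticsearch
mv /opt/elasticsearch /opt/elasticsearch.old
mkdir /opt/elasticsearch
chown elasticsearch:elasticsearch /opt/elasticsearch
systemctl start elasticsearch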

Thanks for the reply.

How can we check what caused the nodes to form individual clusters?

My path.data in the config file is /opt/elasticsearch; do you mean everything under /opt/elasticsearch needs to be removed?

Or, there is a _state directory at /opt/elasticsearch/nodes/0/_state; is it enough to remove only that?

I have moved the data from /opt/elasticsearch/ to another location and restarted Elasticsearch and Kibana.

On the second Kibana node, it has now been running for some time:

systemctl status kibana -l
● kibana.service - Kibana
   Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-04-20 12:25:20 IST; 18min ago
 Main PID: 19871 (node)
   CGroup: /system.slice/kibana.service
           └─19871 /usr/share/kibana/bin/../node/bin/node /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml

Apr 20 12:25:37 <Hostname> kibana[19871]: {"type":"log","@timestamp":"2021-04-20T06:55:37Z","tags":["status","plugin:cross_cluster_replication@7.4.0","info"],"pid":19871,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
Apr 20 12:25:37 <Hostname> kibana[19871]: {"type":"log","@timestamp":"2021-04-20T06:55:37Z","tags":["status","plugin:file_upload@7.4.0","info"],"pid":19871,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
Apr 20 12:25:37 <Hostname> kibana[19871]: {"type":"log","@timestamp":"2021-04-20T06:55:37Z","tags":["status","plugin:snapshot_restore@7.4.0","info"],"pid":19871,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
Apr 20 12:25:37 <Hostname> kibana[19871]: {"type":"log","@timestamp":"2021-04-20T06:55:37Z","tags":["info","monitoring","kibana-monitoring"],"pid":19871,"message":"Starting monitoring stats collection"}
Apr 20 12:25:37 <Hostname> kibana[19871]: {"type":"log","tags":["status","plugin:reporting@7.4.0","info"],"pid":19871,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
Apr 20 12:25:38 <Hostname> kibana[19871]: {"type":"log","@timestamp":"2021-04-20T06:55:38Z","tags":["info","migrations"],"pid":19871,"message":"Creating index .kibana_task_manager_1."}
Apr 20 12:25:38 <Hostname> kibana[19871]: {"type":"log","@timestamp":"2021-04-20T06:55:38Z","tags":["warning","migrations"],"pid":19871,"message":"Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_task_manager_1 and restarting Kibana."}
Apr 20 12:27:54 <Hostname> kibana[19871]: {"type":"log","@timestamp":"2021-04-20T06:57:54Z","tags":["error","elasticsearch","admin"],"pid":19871,"message":"Request error, retrying\nGET http://10.191.156.155:5200/.kibana_task_manager => socket hang up"}
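
If it had stayed stuck on that migration warning, the message's own suggestion would translate into deleting the stuck .kibana_task_manager_1 index and restarting Kibana, i.e. something like:

curl -X DELETE -u elastic 'http://localhost:5200/.kibana_task_manager_1'
systemctl restart kibana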

Looks like ES started successfully. ES logs:

[2021-04-20T12:24:35,185][INFO ][o.e.l.LicenseService     ] [elasticsearch_2] license [cb23e374-5a9d-4f15-b2be-c9cd49b41f3b] mode [basic] - valid
[2021-04-20T12:24:35,186][INFO ][o.e.x.s.s.SecurityStatusChangeListener] [elasticsearch_2] Active license is now [BASIC]; Security is enabled
[2021-04-20T12:24:35,231][INFO ][o.e.h.AbstractHttpServerTransport] [elasticsearch_2] publish_address {10.191.156.154:5200}, bound_addresses {0.0.0.0:5200}
[2021-04-20T12:24:35,232][INFO ][o.e.n.Node               ] [elasticsearch_2] started
[2021-04-20T12:27:33,508][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] master node changed {previous [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}], current []}, term: 106927, version: 3877, reason: becoming candidate: joinLeaderInTerm
[2021-04-20T12:27:33,627][INFO ][o.e.c.s.MasterService    ] [elasticsearch_2] elected-as-master ([2] nodes joined)[{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} elect leader, {elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 106929, version: 3878, reason: master node changed {previous [], current [{elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}]}
[2021-04-20T12:27:33,903][INFO ][o.e.c.c.JoinHelper       ] [elasticsearch_2] failed to join {elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=106928, lastAcceptedTerm=106927, lastAcceptedVersion=3877, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_1][10.191.156.153:5300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: node is no longer master for term 106929 while handling publication
        at org.elasticsearch.cluster.coordination.Coordinator.publish(Coordinator.java:1049) ~[elasticsearch-7.4.0.jar:7.4.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:830) [?:?]
[2021-04-20T12:27:33,906][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] master node changed {previous [], current [{elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}]}, term: 106929, version: 3878, reason: Publication{term=106929, version=3878}
[2021-04-20T12:27:34,008][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [elasticsearch_2] failed to execute on node [rpPhLWZzSWmVpyTWXl2PmQ]
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][cluster:monitor/nodes/stats[n]]
Caused by: java.lang.IllegalStateException: environment is not locked

Caused by: java.nio.file.NoSuchFileException: /opt/elasticsearch/nodes/0/node.lock
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]

[2021-04-20T12:27:34,042][WARN ][o.e.g.G.InternalReplicaShardAllocator] [elasticsearch_2] [metricbeat-7.4.0-2021.04.20][0]: failed to list shard for shard_store on node [rpPhLWZzSWmVpyTWXl2PmQ]
org.elasticsearch.action.FailedNodeException: Failed node [rpPhLWZzSWmVpyTWXl2PmQ]

Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:cluster/nodes/indices/shard/store[n]]
Caused by: java.lang.IllegalStateException: environment is not locked

[2021-04-20T12:27:34,052][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] failing shard [failed shard, shard [apm-7.4.0-2021.04.20][0], node[rpPhLWZzSWmVpyTWXl2PmQ], [R], s[STARTED], a[id=cz31swkJRyiHP457BbSmUg], message [failed to perform indices:data/write/bulk[s] on replica [apm-7.4.0-2021.04.20][0], node[rpPhLWZzSWmVpyTWXl2PmQ], [R], s[STARTED], a[id=cz31swkJRyiHP457BbSmUg]], failure [RemoteTransportException[[elasticsearch_3][10.191.156.155:5300][indices:data/write/bulk[s][r]]]; nested: AlreadyClosedException[translog is already closed]; nested: NoSuchFileException[/opt/elasticsearch/nodes/0/indices/s-IAyTIsTvqGRyborHLr_w/0/translog/translog.ckp]; ], markAsStale [true]]
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][indices:data/write/bulk[s][r]]
Caused by: org.apache.lucene.store.AlreadyClosedException: translog is already closed
        at org.elasticsearch.index.translog.Translog.ensureOpen(Translog.java:1784) ~[elasticsearch-7.4.0.jar:7.4.0]
        at org.elasticsearch.index.translog.Translog.add(Translog.java:546) ~[elasticsearch-7.4.0.jar:7.4.0]
        at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:918) ~[elasticsearch-7.4.0.jar:7.4.0]

Caused by: java.nio.file.NoSuchFileException: /opt/elasticsearch/nodes/0/indices/s-IAyTIsTvqGRyborHLr_w/0/translog/translog.ckp
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]

Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:cluster/nodes/indices/shard/store[n]]
Caused by: java.lang.IllegalStateException: environment is not locked

[2021-04-20T12:27:35,687][INFO ][o.e.c.s.MasterService    ] [elasticsearch_2] node-left[{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} followers check retry count exceeded], term: 106929, version: 3881, reason: removed {{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}
[2021-04-20T12:27:36,433][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] removed {{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3881, reason: Publication{term=106929, version=3881}
[2021-04-20T12:27:36,472][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [heartbeat-7.4.0-2021.04.17][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,473][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [.monitoring-es-7-2021.04.15][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,507][DEBUG][o.e.a.a.i.g.TransportGetIndexAction] [elasticsearch_2] connection exception while trying to forward request with action name [indices:admin/get] to master node [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}], scheduling a retry. Error: [org.elasticsearch.transport.NodeDisconnectedException: [elasticsearch_3][10.191.156.155:5300][indices:admin/get] disconnected]
[2021-04-20T12:27:36,511][INFO ][o.e.c.r.DelayedAllocationService] [elasticsearch_2] scheduling reroute for delayed shards in [59.1s] (27 delayed shards)
[2021-04-20T12:27:36,512][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [apm-7.4.0-2021.04.15][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,592][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [metricbeat-7.4.0-2021.04.16][0] primary-replica resync completed with 0 operations
[2021-04-20T12:28:36,140][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] [filebeat-7.4.0-2021.04.20][0] marking unavailable shards as stale: [L3YU4w2OSG2Nb05AK4bv_g]
[2021-04-20T12:28:36,246][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] [metricbeat-7.4.0-2021.04.19][0] marking unavailable shards as stale: [PqNnaeWEQQG2yXyEuY8sjw]
[2021-04-20T12:28:39,344][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] added {{elasticsearch_3}{Nzkhpw9zQoO-m9vPi7yhqA}{k3ySDUTcQAKMh9Az0veQOw}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3892, reason: Publication{term=106929, version=3892}

ES health on all three nodes shows fine:

curl -XGET 'http://localhost:5200/_cluster/health' -u elastic
Enter host password for user 'elastic':
{"cluster_name":"sbibh-uat","status":"green","timed_out":false,"number_of_nodes":3,"number_of_data_nodes":3,"active_primary_shards":40,"active_shards":80,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
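
(_cat/nodes is another quick way to verify that all three nodes joined the same cluster; the elected master is flagged with a * in the master column:)

curl -u elastic 'http://localhost:5200/_cat/nodes?v'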

Kibana is still having some issues:

curl -I http://127.0.0.1:8601/status
HTTP/1.1 503 Service Unavailable
retry-after: 30
content-type: text/html; charset=utf-8
cache-control: no-cache
content-length: 30
Date: Tue, 20 Apr 2021 07:21:19 GMT
Connection: keep-alive

curl -s http://127.0.0.1:8601/api/status
Kibana server is not ready yet

On the first node, Kibana is still restarting:

systemctl status  kibana -l
● kibana.service - Kibana
   Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-04-20 12:55:09 IST; 1s ago
 Main PID: 121196 (node)
   CGroup: /system.slice/kibana.service
           └─121196 /usr/share/kibana/bin/../node/bin/node /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml

Apr 20 12:55:09 itfoobpnoneuapp3uat systemd[1]: Started Kibana.
systemctl status  elasticsearch.service  -l
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/elasticsearch.service.d
           └─startup-timeout.conf
   Active: active (running) since Tue 2021-04-20 12:16:16 IST; 38min ago
     Docs: http://www.elastic.co
 Main PID: 113870 (java)
   CGroup: /system.slice/elasticsearch.service
           ├─113870 /usr/share/elasticsearch/jdk/bin/java -Xms5288m -Xmx5288m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch-14603039941648378243 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/elasticsearch -XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Djava.locale.providers=COMPAT -Xms5288m -Xmx5288m -Dio.netty.allocator.type=pooled -XX:MaxDirectMemorySize=2772434944 -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/etc/elasticsearch -Des.distribution.flavor=default -Des.distribution.type=rpm -Des.bundled_jdk=true -cp /usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid --quiet
           └─114036 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

Apr 20 12:15:18 itfoobpnoneuapp3uat systemd[1]: Starting Elasticsearch...
Apr 20 12:15:27 itfoobpnoneuapp3uat elasticsearch[113870]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Apr 20 12:16:16 itfoobpnoneuapp3uat systemd[1]: Started Elasticsearch.

ES logs:

[2021-04-20T12:16:12,342][INFO ][o.e.t.TransportService   ] [elasticsearch_1] publish_address {10.191.156.153:5300}, bound_addresses {0.0.0.0:5300}
[2021-04-20T12:16:12,352][INFO ][o.e.b.BootstrapChecks    ] [elasticsearch_1] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2021-04-20T12:16:13,566][INFO ][o.e.c.c.Coordinator      ] [elasticsearch_1] setting initial configuration to VotingConfiguration{tadP0OIIQ3KZZF4BFK6_ig,k4u-jf-zTZihWzqMNGnQ4w,rpPhLWZzSWmVpyTWXl2PmQ}
[2021-04-20T12:16:14,947][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] master node changed {previous [], current [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}]}, added {{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},{elasticsearch_2}{k4u-jf-zTZihWzqMNGnQ4w}{GwV94yfTRH6NSdqfUTeAhw}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106927, version: 3708, reason: ApplyCommitRequest{term=106927, version=3708, sourceNode={elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}
[2021-04-20T12:16:15,255][INFO ][o.e.x.s.a.TokenService   ] [elasticsearch_1] refresh keys
[2021-04-20T12:16:15,448][INFO ][o.e.x.s.a.TokenService   ] [elasticsearch_1] refreshed keys
[2021-04-20T12:16:15,927][INFO ][o.e.l.LicenseService     ] [elasticsearch_1] license [cb23e374-5a9d-4f15-b2be-c9cd49b41f3b] mode [basic] - valid
[2021-04-20T12:16:15,928][INFO ][o.e.x.s.s.SecurityStatusChangeListener] [elasticsearch_1] Active license is now [BASIC]; Security is enabled
[2021-04-20T12:16:16,163][INFO ][o.e.h.AbstractHttpServerTransport] [elasticsearch_1] publish_address {10.191.156.153:5200}, bound_addresses {0.0.0.0:5200}
[2021-04-20T12:16:16,164][INFO ][o.e.n.Node               ] [elasticsearch_1] started
[2021-04-20T12:20:00,987][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [elasticsearch_1] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/10.191.156.153:5300, remoteAddress=/10.191.156.153:44978}
[2021-04-20T12:22:12,635][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] removed {{elasticsearch_2}{k4u-jf-zTZihWzqMNGnQ4w}{GwV94yfTRH6NSdqfUTeAhw}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106927, version: 3761, reason: ApplyCommitRequest{term=106927, version=3761, sourceNode={elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}
[2021-04-20T12:22:12,892][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [metricbeat-7.4.0-2021.04.18][0] primary-replica resync completed with 0 operations
[2021-04-20T12:22:12,893][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [filebeat-7.4.0-2021.04.19][0] primary-replica resync completed with 0 operations

[2021-04-20T12:28:19,589][INFO ][o.e.c.c.Coordinator      ] [elasticsearch_1] master node [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}] failed, restarting discovery
org.elasticsearch.ElasticsearchException: node [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}] failed [3] consecutive checks

Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:coordination/fault_detection/leader_check]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: non-leader rejecting leader check
        at org.elasticsearch.cluster.coordination.LeaderChecker.handleLeaderCheck(LeaderChecker.java:178) ~[elasticsearch-7.4.0.jar:7.4.0]
  
[2021-04-20T12:28:19,599][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] master node changed {previous [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}], current []}, term: 106927, version: 3877, reason: becoming candidate: onLeaderFailure
[2021-04-20T12:28:19,840][INFO ][o.e.c.s.MasterService    ] [elasticsearch_1] elected-as-master ([2] nodes joined)[{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} elect leader, {elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 106929, version: 3878, reason: master node changed {previous [], current [{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}]}
[2021-04-20T12:28:20,041][WARN ][o.e.c.s.MasterService    ] [elasticsearch_1] failing [elected-as-master ([2] nodes joined)[{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} elect leader, {elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_]]: failed to commit cluster state version [3878]
org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: node is no longer master for term 106929 while handling publication
        at org.elasticsearch.cluster.coordination.Coordinator.publish(Coordinator.java:1049) ~[elasticsearch-7.4.0.jar:7.4.0]
        at org.elasticsearch.cluster.service.MasterService.publish(MasterService.java:268) [elasticsearch-7.4.0.jar:7.4.0]

[2021-04-20T12:28:20,045][INFO ][o.e.c.c.JoinHelper       ] [elasticsearch_1] failed to join {elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} with JoinRequest{sourceNode={elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=106928, lastAcceptedTerm=106927, lastAcceptedVersion=3877, sourceNode={elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch_1}

[2021-04-20T12:28:20,047][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] master node changed {previous [], current [{elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}]}, term: 106929, version: 3878, reason: ApplyCommitRequest{term=106929, version=3878, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}
[2021-04-20T12:28:22,304][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] removed {{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3881, reason: ApplyCommitRequest{term=106929, version=3881, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}
[2021-04-20T12:28:22,413][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [apm-7.4.0-2021.04.19][0] primary-replica resync completed with 0 operations
[2021-04-20T12:28:22,498][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [filebeat-7.4.0-2021.04.20][0] primary-replica resync completed with 0 operations
[2021-04-20T12:28:23,013][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [.monitoring-es-7-2021.04.20][0] primary-replica resync completed with 0 operations
[2021-04-20T12:29:24,586][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] added {{elasticsearch_3}{Nzkhpw9zQoO-m9vPi7yhqA}{k3ySDUTcQAKMh9Az0veQOw}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3892, reason: ApplyCommitRequest{term=106929, version=3892, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}

curl -XGET 'http://localhost:5200/_cluster/health' -u elastic
Enter host password for user 'elastic':
{"cluster_name":"sbibh-uat","status":"green","timed_out":false,"number_of_nodes":3,"number_of_data_nodes":3,"active_primary_shards":40,"active_shards":80,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}

Kibana here is giving connection refused.

curl -v -s http://127.0.0.1:8601/api/status | jsonpp
* About to connect() to 127.0.0.1 port 8601 (#0)
*   Trying 127.0.0.1...
* Connection refused
* Failed connect to 127.0.0.1:8601; Connection refused
* Closing connection 0

Check the Kibana logs for the specific error it's encountering
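
Under systemd, when logging.dest is not set in kibana.yml, Kibana writes to stdout and its output lands in the journal, so something like this minimal sketch:

# follow the Kibana unit's output live
journalctl -fu kibana
# or dump everything since the last boot
journalctl -u kibana -b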

I was getting the permission issue below on the node where Kibana was restarting. After setting the ownership to the kibana user, Kibana is running now.

Apr 20 17:15:43 itfoobpnoneuapp3uat systemd: Stopped Kibana.
Apr 20 17:15:43 itfoobpnoneuapp3uat systemd: Started Kibana.
Apr 20 17:15:47 itfoobpnoneuapp3uat kibana: fs.js:115
Apr 20 17:15:47 itfoobpnoneuapp3uat kibana: throw err;
Apr 20 17:15:47 itfoobpnoneuapp3uat kibana: ^
Apr 20 17:15:47 itfoobpnoneuapp3uat kibana: Error: EACCES: permission denied, open '/usr/share/kibana/optimize/.babel_register_cache.json'
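
The fix described above amounts to something like this (a sketch assuming the default RPM layout, where the optimize directory holds Kibana's build cache):

# let the kibana service user write its cache directory, then restart
chown -R kibana:kibana /usr/share/kibana/optimize
systemctl restart kibana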

On both servers the Kibana process now stays up, but both nodes keep logging that the Kibana server is not ready.

first node -

systemctl status kibana -l

Apr 20 19:58:22  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:28:22Z","tags":["debug","monitoring","kibana-monitoring"],"pid":24219,"message":"Received Kibana Ops event data"}
Apr 20 19:58:22  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:28:22Z","tags":["plugin","debug"],"pid":24219,"message":"Checking Elasticsearch version"}
Apr 20 19:58:23  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:28:23Z","tags":["debug","http"],"pid":24219,"message":"Kibana server is not ready yet get:/login."}
Apr 20 19:58:25  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:28:25Z","tags":["debug","monitoring","kibana-monitoring"],"pid":24219,"message":"Received Kibana Ops event data"}
Apr 20 19:58:25  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:28:25Z","tags":["plugin","debug"],"pid":24219,"message":"Checking Elasticsearch version"}
Apr 20 19:58:25  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:28:25Z","tags":["debug","http"],"pid":24219,"message":"Kibana server is not ready yet get:/login."}
journalctl -fu kibana

Apr 20 20:00:09  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:09Z","tags":["debug","stats-collection"],"pid":24219,"message":"Fetching data from sample-data collector"}
Apr 20 20:00:09  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:09Z","tags":["debug","stats-collection"],"pid":24219,"message":"Fetching data from kql collector"}
Apr 20 20:00:09  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:09Z","tags":["debug","stats-collection"],"pid":24219,"message":"Fetching data from localization collector"}
Apr 20 20:00:10  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:10Z","tags":["debug","monitoring","kibana-monitoring"],"pid":24219,"message":"Received Kibana Ops event data"}
Apr 20 20:00:10  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:10Z","tags":["debug","monitoring","kibana-monitoring"],"pid":24219,"message":"Received Kibana Ops event data"}
Apr 20 20:00:10  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:10Z","tags":["plugin","debug"],"pid":24219,"message":"Checking Elasticsearch version"}
Apr 20 20:00:11  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:11Z","tags":["debug","http"],"pid":24219,"message":"Kibana server is not ready yet get:/login."}
Apr 20 20:00:11  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:11Z","tags":["debug","http"],"pid":24219,"message":"Kibana server is not ready yet get:/login."}
Apr 20 20:00:12  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:12Z","tags":["debug","monitoring","kibana-monitoring"],"pid":24219,"message":"Received Kibana Ops event data"}
Apr 20 20:00:12  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:12Z","tags":["plugin","debug"],"pid":24219,"message":"Checking Elasticsearch version"}
Apr 20 20:00:13  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:13Z","tags":["debug","http"],"pid":24219,"message":"Kibana server is not ready yet get:/login."}
Apr 20 20:00:19  kibana[24219]: {"type":"log","@timestamp":"2021-04-20T14:30:19Z","tags":["debug","stats-collection"],"pid":24219,"message":"All collectors are not ready (waiting for maps,visualization_types) but we have waited the required 60s and will return data from all collectors that are ready."}

The command below is showing all three ES nodes properly:

curl -XGET 'http://localhost:5200/_cat/nodes?v=true' -u elastic
Enter host password for user 'elastic':
ip  heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
ip1           75          99   5    0.06    0.40     0.63 dilm      -      elasticsearch_1
ip2            8          98   4    0.24    0.20     0.22 dilm      -      elasticsearch_3
ip3           66          99   4    0.20    0.29     0.35 dilm      *      elasticsearch_2

ES logs -

[2021-04-20T12:16:15,927][INFO ][o.e.l.LicenseService     ] [elasticsearch_1] license [cb23e374-5a9d-4f15-b2be-c9cd49b41f3b] mode [basic] - valid
[2021-04-20T12:16:15,928][INFO ][o.e.x.s.s.SecurityStatusChangeListener] [elasticsearch_1] Active license is now [BASIC]; Security is enabled
[2021-04-20T12:16:16,163][INFO ][o.e.h.AbstractHttpServerTransport] [elasticsearch_1] publish_address {10.191.156.153:5200}, bound_addresses {0.0.0.0:5200}
[2021-04-20T12:16:16,164][INFO ][o.e.n.Node               ] [elasticsearch_1] started

[2021-04-20T12:28:19,589][INFO ][o.e.c.c.Coordinator      ] [elasticsearch_1] master node [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}] failed, restarting discovery
org.elasticsearch.ElasticsearchException: node [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}] failed [3] consecutive checks

Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:coordination/fault_detection/leader_check]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: non-leader rejecting leader check

[2021-04-20T12:28:19,599][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] master node changed {previous [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}], current []}, term: 106927, version: 3877, reason: becoming candidate: onLeaderFailure
[2021-04-20T12:28:19,840][INFO ][o.e.c.s.MasterService    ] [elasticsearch_1] elected-as-master ([2] nodes joined)[{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} elect leader, {elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 106929, version: 3878, reason: master node changed {previous [], current [{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}]}
[2021-04-20T12:28:20,041][WARN ][o.e.c.s.MasterService    ] [elasticsearch_1] failing [elected-as-master ([2] nodes joined)[{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} elect leader, {elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_]]: failed to commit cluster state version [3878]
org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: node is no longer master for term 106929 while handling publication

[2021-04-20T12:28:20,045][INFO ][o.e.c.c.JoinHelper       ] [elasticsearch_1] failed to join {elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} with JoinRequest{sourceNode={elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=106928, lastAcceptedTerm=106927, lastAcceptedVersion=3877, sourceNode={elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}}]}
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_1][10.191.156.153:5300][internal:cluster/coordination/join]

[2021-04-20T12:28:20,047][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] master node changed {previous [], current [{elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}]}, term: 106929, version: 3878, reason: ApplyCommitRequest{term=106929, version=3878, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}
[2021-04-20T12:28:22,304][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] removed {{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3881, reason: ApplyCommitRequest{term=106929, version=3881, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}
[2021-04-20T12:28:22,413][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [apm-7.4.0-2021.04.19][0] primary-replica resync completed with 0 operations
[2021-04-20T12:28:22,645][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [.monitoring-es-7-2021.04.14][0] primary-replica resync completed with 0 operations
[2021-04-20T12:28:23,013][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_1] [.monitoring-es-7-2021.04.20][0] primary-replica resync completed with 0 operations
[2021-04-20T12:29:24,586][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_1] added {{elasticsearch_3}{Nzkhpw9zQoO-m9vPi7yhqA}{k3ySDUTcQAKMh9Az0veQOw}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3892, reason: ApplyCommitRequest{term=106929, version=3892, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}

second node -

Apr 20 20:09:52 itfoobpnoneuapp4uat kibana[36793]: {"type":"log","@timestamp":"2021-04-20T14:39:52Z","tags":["debug","stats-collection"],"pid":36793,"message":"Fetching data from ui_metric collector"}
Apr 20 20:09:52 itfoobpnoneuapp4uat kibana[36793]: {"type":"log","@timestamp":"2021-04-20T14:39:52Z","tags":["debug","http"],"pid":36793,"message":"Kibana server is not ready yet get:/login."}

The command below is showing Service Unavailable.

curl -v -s http://127.0.0.1:8601/api/status
* About to connect() to 127.0.0.1 port 8601 (#0)
* Connected to 127.0.0.1 (127.0.0.1) port 8601 (#0)
>
< HTTP/1.1 503 Service Unavailable
< retry-after: 30
< content-type: text/html; charset=utf-8
< cache-control: no-cache
< content-length: 30
< Date: Tue, 20 Apr 2021 14:43:19 GMT
< Connection: keep-alive
<
* Connection #0 to host 127.0.0.1 left intact
Kibana server is not ready yet
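
Until Kibana reports ready, /api/status keeps returning 503, so you can poll it (a sketch; curl's -f flag turns the 503 into a non-zero exit status):

# block until Kibana answers /api/status with a 2xx
until curl -fsS http://127.0.0.1:8601/api/status >/dev/null; do sleep 5; done
echo "Kibana is ready"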

ES logs -

[2021-04-20T12:24:35,185][INFO ][o.e.l.LicenseService     ] [elasticsearch_2] license [cb23e374-5a9d-4f15-b2be-c9cd49b41f3b] mode [basic] - valid
[2021-04-20T12:24:35,186][INFO ][o.e.x.s.s.SecurityStatusChangeListener] [elasticsearch_2] Active license is now [BASIC]; Security is enabled
[2021-04-20T12:24:35,231][INFO ][o.e.h.AbstractHttpServerTransport] [elasticsearch_2] publish_address {10.191.156.154:5200}, bound_addresses {0.0.0.0:5200}
[2021-04-20T12:24:35,232][INFO ][o.e.n.Node               ] [elasticsearch_2] started

[2021-04-20T12:27:33,508][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] master node changed {previous [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}], current []}, term: 106927, version: 3877, reason: becoming candidate: joinLeaderInTerm
[2021-04-20T12:27:33,627][INFO ][o.e.c.s.MasterService    ] [elasticsearch_2] elected-as-master ([2] nodes joined)[{elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} elect leader, {elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 106929, version: 3878, reason: master node changed {previous [], current [{elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}]}
[2021-04-20T12:27:33,903][INFO ][o.e.c.c.JoinHelper       ] [elasticsearch_2] failed to join {elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} with JoinRequest{sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=106928, lastAcceptedTerm=106927, lastAcceptedVersion=3877, sourceNode={elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch_1}{tadP0OIIQ3KZZF4BFK6_ig}{5SHQPjvyQzCcVDjv6zkj-g}{10.191.156.153}{10.191.156.153:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}}]}
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_1][10.191.156.153:5300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: node is no longer master for term 106929 while handling publication

[2021-04-20T12:27:33,906][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] master node changed {previous [], current [{elasticsearch_2}{pd_Nt9ZaTRuCgYwBxOvBRQ}{9ehJiNq3SbScQi21xwnqMA}{10.191.156.154}{10.191.156.154:5300}{dilm}{ml.machine_memory=16637550592, xpack.installed=true, ml.max_open_jobs=20}]}, term: 106929, version: 3878, reason: Publication{term=106929, version=3878}
[2021-04-20T12:27:34,008][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [elasticsearch_2] failed to execute on node [rpPhLWZzSWmVpyTWXl2PmQ]

org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][cluster:monitor/nodes/stats[n]]
Caused by: java.lang.IllegalStateException: environment is not locked

Caused by: java.nio.file.NoSuchFileException: /opt/elasticsearch/nodes/0/node.lock

[2021-04-20T12:27:34,042][WARN ][o.e.g.G.InternalReplicaShardAllocator] [elasticsearch_2] [metricbeat-7.4.0-2021.04.20][0]: failed to list shard for shard_store on node [rpPhLWZzSWmVpyTWXl2PmQ]
org.elasticsearch.action.FailedNodeException: Failed node [rpPhLWZzSWmVpyTWXl2PmQ]

Caused by: org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][internal:cluster/nodes/indices/shard/store[n]]
Caused by: java.lang.IllegalStateException: environment is not locked

[2021-04-20T12:27:34,052][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] failing shard [failed shard, shard [apm-7.4.0-2021.04.20][0], node[rpPhLWZzSWmVpyTWXl2PmQ], [R], s[STARTED], a[id=cz31swkJRyiHP457BbSmUg], message [failed to perform indices:data/write/bulk[s] on replica [apm-7.4.0-2021.04.20][0], node[rpPhLWZzSWmVpyTWXl2PmQ], [R], s[STARTED], a[id=cz31swkJRyiHP457BbSmUg]], failure [RemoteTransportException[[elasticsearch_3][10.191.156.155:5300][indices:data/write/bulk[s][r]]]; nested: AlreadyClosedException[translog is already closed]; nested: NoSuchFileException[/opt/elasticsearch/nodes/0/indices/s-IAyTIsTvqGRyborHLr_w/0/translog/translog.ckp]; ], markAsStale [true]]
org.elasticsearch.transport.RemoteTransportException: [elasticsearch_3][10.191.156.155:5300][indices:data/write/bulk[s][r]]
Caused by: org.apache.lucene.store.AlreadyClosedException: translog is already closed

[2021-04-20T12:27:35,687][INFO ][o.e.c.s.MasterService    ] [elasticsearch_2] node-left[{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} followers check retry count exceeded], term: 106929, version: 3881, reason: removed {{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}
[2021-04-20T12:27:36,433][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] removed {{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3881, reason: Publication{term=106929, version=3881}
[2021-04-20T12:27:36,480][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [heartbeat-7.4.0-2021.04.19][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,487][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [filebeat-7.4.0-2021.04.15][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,492][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [filebeat-7.4.0-2021.04.17][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,501][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [metricbeat-7.4.0-2021.04.14][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,507][DEBUG][o.e.a.a.i.g.TransportGetIndexAction] [elasticsearch_2] connection exception while trying to forward request with action name [indices:admin/get] to master node [{elasticsearch_3}{rpPhLWZzSWmVpyTWXl2PmQ}{Fo5PE-vCQn-ZYQU-le_peg}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true}], scheduling a retry. Error: [org.elasticsearch.transport.NodeDisconnectedException: [elasticsearch_3][10.191.156.155:5300][indices:admin/get] disconnected]
[2021-04-20T12:27:36,511][INFO ][o.e.c.r.DelayedAllocationService] [elasticsearch_2] scheduling reroute for delayed shards in [59.1s] (27 delayed shards)
[2021-04-20T12:27:36,512][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [apm-7.4.0-2021.04.15][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,592][INFO ][o.e.i.s.IndexShard       ] [elasticsearch_2] [metricbeat-7.4.0-2021.04.16][0] primary-replica resync completed with 0 operations
[2021-04-20T12:27:36,636][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] [.monitoring-logstash-7-2021.04.20][0] marking unavailable shards as stale: [E7TKlgFnSXKAKtEftbGtVA]
[2021-04-20T12:27:36,882][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] [.monitoring-es-7-2021.04.20][0] marking unavailable shards as stale: [f5z8oZEuRIa85eOJ7Lh7jg]
[2021-04-20T12:28:36,140][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] [filebeat-7.4.0-2021.04.20][0] marking unavailable shards as stale: [L3YU4w2OSG2Nb05AK4bv_g]
[2021-04-20T12:28:36,246][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] [metricbeat-7.4.0-2021.04.19][0] marking unavailable shards as stale: [PqNnaeWEQQG2yXyEuY8sjw]
[2021-04-20T12:28:38,421][INFO ][o.e.c.s.MasterService    ] [elasticsearch_2] node-join[{elasticsearch_3}{Nzkhpw9zQoO-m9vPi7yhqA}{k3ySDUTcQAKMh9Az0veQOw}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true} join existing leader], term: 106929, version: 3892, reason: added {{elasticsearch_3}{Nzkhpw9zQoO-m9vPi7yhqA}{k3ySDUTcQAKMh9Az0veQOw}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}
[2021-04-20T12:28:39,344][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch_2] added {{elasticsearch_3}{Nzkhpw9zQoO-m9vPi7yhqA}{k3ySDUTcQAKMh9Az0veQOw}{10.191.156.155}{10.191.156.155:5300}{dilm}{ml.machine_memory=16637550592, ml.max_open_jobs=20, xpack.installed=true},}, term: 106929, version: 3892, reason: Publication{term=106929, version=3892}
[2021-04-20T12:28:47,413][WARN ][o.e.c.r.a.AllocationService] [elasticsearch_2] [metricbeat-7.4.0-2021.04.18][0] marking unavailable shards as stale: [KvxFD9TYTwSQ-Tjt4wkD1Q]

On the third ES node there are no logs after this:

[2021-04-20T12:28:51,664][INFO ][o.e.n.Node               ] [elasticsearch_3] started

Can you please check the ES logs and suggest what the issue could be?

I'm no expert on this unfortunately, but it seems like your Elasticsearch nodes are not fully operational:

Make sure there is an /opt/elasticsearch directory and that it has the same ownership and permissions as the original one you moved away from that location.
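
A quick way to compare the two trees (a sketch using GNU stat; adjust the second set of paths to wherever the old copy was moved):

# print mode, owner:group and name for each path
stat -c '%A %U:%G %n' /opt/elasticsearch /opt/elasticsearch/nodes /opt/elasticsearch/nodes/0
stat -c '%A %U:%G %n' /path/to/moved/nodes /path/to/moved/nodes/0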

Thanks for your reply.

I have moved the nodes directory from /opt/elasticsearch to another location.

Yes, all three nodes have the /opt/elasticsearch directory, and its permissions and ownership are the same across nodes, but they differ from the moved nodes directory. Most of the difference is in the "other" permission bits, apart from node.lock.

Below is a listing of the data that got created after moving the old data and restarting ES.

drwxr-s---  3 elasticsearch elasticsearch 4096 Apr 20 12:27 /opt/elasticsearch/
drwxr-sr-x  3 elasticsearch elasticsearch 4096 Apr 20 12:27 /opt/elasticsearch/nodes
drwxr-sr-x  4 elasticsearch elasticsearch 4096 Apr 20 12:28 0
drwxr-sr-x 48 elasticsearch elasticsearch 4096 Apr 21 05:30 indices
-rw-r--r--  1 elasticsearch elasticsearch    0 Apr 20 12:28 node.lock
drwxr-sr-x  2 elasticsearch elasticsearch 4096 Apr 21 12:15 _state

Below is the listing of the moved data.

drwxr-s---  3 elasticsearch elasticsearch 4096 Apr 14 17:53 nodes
drwxr-s---  4 elasticsearch elasticsearch 4.0K Apr 19 22:06 0
-rwxr-s---  1 elasticsearch elasticsearch    0 Apr 14 17:53 node.lock
drwxr-s--- 42 elasticsearch elasticsearch 4.0K Apr 20 05:30 indices
drwxr-s---  2 elasticsearch elasticsearch 4.0K Apr 20 12:27 _state

This path is not created by the Elasticsearch RPM, so I can't run the command that resets the RPM's default ownership and permissions.
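
For reference, the RPM reset being described is presumably the pair below; it only applies to files the package itself installed, so it can't help with a custom path.data:

# restore package-default permissions and ownership (packaged files only)
rpm --setperms elasticsearch
rpm --setugids elasticsearch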

Do you know which one is correct?

Could you try moving all contents, not just "nodes" (leaving an empty data directory to work with)? I think you left that directory in a partial state and Elasticsearch doesn't really know what to do with it.
If it's empty it will start from scratch.
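
In shell terms, something like the sketch below (destructive: the node will start with an empty data directory; the backup name and the 2750 mode are assumptions based on the listings above):

systemctl stop elasticsearch
mv /opt/elasticsearch /opt/elasticsearch.old        # keep the old data aside
install -d -o elasticsearch -g elasticsearch -m 2750 /opt/elasticsearch
systemctl start elasticsearch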

Thank you for your prompt reply.

path.data is set to:
path.data: /opt/elasticsearch

/opt/elasticsearch only has the nodes directory, which I have already moved. Do you mean I should move any other directory as well?

/opt/elasticsearch should be fully empty and writable by the Elasticsearch user.

@flash1293,

When the nodes directory was moved, /opt/elasticsearch was empty and writable by the elasticsearch user.

I think ES is not having any issue now.

How can we investigate the Kibana issue further? It is still logging "Kibana server is not ready yet get:/login."

If there are indices on the cluster starting with .kibana, please delete them, then try to start Kibana again. I think it would make sense to just start a single Kibana node, and wait until it's running before starting other nodes. Maybe different instances are somehow interfering.
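
A sketch of that cleanup, using the port and credentials from the configs above (destructive: deleting .kibana* removes saved objects such as dashboards and index patterns):

# list the Kibana system indices first
curl -XGET 'http://localhost:5200/_cat/indices/.kibana*?v' -u elastic
# with all Kibana instances stopped, delete them
curl -XDELETE 'http://localhost:5200/.kibana*' -u elastic
# then start Kibana on a single node and wait for it to become ready
systemctl start kibana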

Yes, we performed the same steps and are now able to log in to Kibana. Thanks for your support.