Hello,
Thank you for the hint :). I am also attaching the relevant logs that the screenshot showed:
(The last line of the log output below contains the FATAL ERROR)
kibana.service - Kibana
Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Thu 2020-04-23 15:26:22 CEST; 6min ago
Process: 5695 ExecStart=/usr/share/kibana/bin/kibana -c /etc/kibana/kibana.yml (code=exited, status=1/FAILURE)
Main PID: 5695 (code=exited, status=1/FAILURE)
Apr 23 15:26:19 sag-prd-es-003.sag.services systemd[1]: Unit kibana.service entered failed state.
Apr 23 15:26:19 sag-prd-es-003.sag.services systemd[1]: kibana.service failed.
Apr 23 15:26:22 sag-prd-es-003.sag.services systemd[1]: kibana.service holdoff time over, scheduling restart.
Apr 23 15:26:22 sag-prd-es-003.sag.services systemd[1]: Stopped Kibana.
Apr 23 15:26:22 sag-prd-es-003.sag.services systemd[1]: start request repeated too quickly for kibana.service
Apr 23 15:26:22 sag-prd-es-003.sag.services systemd[1]: Failed to start Kibana.
Apr 23 15:26:22 sag-prd-es-003.sag.services systemd[1]: Unit kibana.service entered failed state.
Apr 23 15:26:22 sag-prd-es-003.sag.services systemd[1]: kibana.service failed.
[root@sag-prd-es-003 log]# journalctl -fu kibana.service
-- Logs begin at Thu 2020-04-23 14:46:02 CEST. --
Apr 23 15:26:18 sag-prd-es-003.sag.services kibana[5695]: FATAL [illegal_argument_exception] Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2252]/[2000] maximum shards open; :: {"path":"/.kibana_task_manager_2","query":{},"body":"{\"mappings\":{\"dynamic\":\"strict\",\"properties\":{\"kibana\":{\"properties\":{\"apiVersion\":{\"type\":\"integer\"},\"uuid\":{\"type\":\"keyword\"},\"version\":{\"type\":\"integer\"}}},\"task\":{\"properties\":{\"taskType\":{\"type\":\"keyword\"},\"scheduledAt\":{\"type\":\"date\"},\"runAt\":{\"type\":\"date\"},\"startedAt\":{\"type\":\"date\"},\"retryAt\":{\"type\":\"date\"},\"schedule\":{\"properties\":{\"interval\":{\"type\":\"keyword\"}}},\"attempts\":{\"type\":\"integer\"},\"status\":{\"type\":\"keyword\"},\"params\":{\"type\":\"text\"},\"state\":{\"type\":\"text\"},\"user\":{\"type\":\"keyword\"},\"scope\":{\"type\":\"keyword\"},\"ownerId\":{\"type\":\"keyword\"}}},\"type\":{\"type\":\"keyword\"},\"config\":{\"dynamic\":\"true\",\"properties\":{\"buildNum\":{\"type\":\"keyword\"}}},\"migrationVersion\":{\"dynamic\":\"true\",\"type\":\"object\"},\"namespace\":{\"type\":\"keyword\"},\"updated_at\":{\"type\":\"date\"},\"references\":{\"type\":\"nested\",\"properties\":{\"name\":{\"type\":\"keyword\"},\"type\":{\"type\":\"keyword\"},\"id\":{\"type\":\"keyword\"}}}},\"_meta\":{\"migrationMappingPropertyHashes\":{\"config\":\"87aca8fdb053154f11383fce3dbf3edf\",\"migrationVersion\":\"4a1746014a75ade3a714e1db5763276f\",\"type\":\"2f4316de49999235636386fe51dc06c1\",\"namespace\":\"2f4316de49999235636386fe51dc06c1\",\"updated_at\":\"00da57df13e94e9d98437d13ace4bfe0\",\"references\":\"7997cf5a56cc02bdc9c93361bde732b0\",\"task\":\"235412e52d09e7165fac8a67a43ad6b4\"}}},\"settings\":{\"number_of_shards\":1,\"auto_expand_replicas\":\"0-1\"}}","statusCode":400,"response":"{\"error\":{\"root_cause\":[{\"type\":\"illegal_argument_exception\",\"reason\":\"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2252]/[2000] maximum shards open;\"}],\"type\":\"illegal_argument_exception\",\"reason\":\"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2252]/[2000] maximum shards open;\"},\"status\":400}"}
So, as the logs above show, the key message is: FATAL [illegal_argument_exception] Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2252]/[2000] maximum shards open; ::
Was that a data size issue? More specifically, when I shut down the first node, did the other 2 nodes try to take over all of its stored data, and was the size too big for them to handle?
I have 3 nodes with 500GB each, so when I shut down one of them, did the other 2 try to take over its data load (291.6 + 337.8 + 375.5 = 1004.9 > 1000), and is that what led to the failure?
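In case it helps, the numbers in the error seem to point to shard count rather than disk size: the [2000] limit would match 2 remaining data nodes times a cluster.max_shards_per_node of 1000, which I believe is the default (an assumption on my side; my cluster may override it). A rough sketch of that arithmetic, with hypothetical localhost curl checks commented out:

```shell
# Rough sketch of the shard-limit arithmetic, not a definitive diagnosis.
# Assumption: cluster.max_shards_per_node is at its default of 1000
# (would need to be verified via the _cluster/settings request below).
MAX_SHARDS_PER_NODE=1000   # assumed default; check your own cluster settings
DATA_NODES=2               # the two nodes that stayed up after the shutdown
LIMIT=$((DATA_NODES * MAX_SHARDS_PER_NODE))

echo "cluster-wide shard limit: $LIMIT"   # 2000, matching the [2000] in the log
echo "open shards per the log: 2252"      # exceeds the limit, so index creation fails

# The live values could be checked with something like (host/port are assumptions):
# curl -s 'localhost:9200/_cluster/health?filter_path=active_shards'
# curl -s 'localhost:9200/_cluster/settings?include_defaults=true&filter_path=*.cluster.max_shards_per_node'
```

If that reading is right, Kibana failed because creating .kibana_task_manager_2 would add 2 shards on top of the 2252 already open, not because the remaining disks were too small.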