ECK 1.0.0-beta1: pod goes into CrashLoopBackOff during node startup

Hi,

I have a 6-node ECK cluster. Whenever a node reboots, the Elasticsearch 7.5 pod does not start and goes into CrashLoopBackOff. If I delete the pod, it starts properly. I am not sure why the pod goes into a crash loop.

# kgpw -w
NAME                         READY   STATUS                  RESTARTS   AGE   
elastic-operator-0           1/1     Running                 8          2d22h 
elk-prd-es-default-0         1/1     Running                 0          22h   
elk-prd-es-default-1         1/1     Running                 0          2d16h 
elk-prd-es-default-2         1/1     Running                 0          2d16h 
elk-prd-es-default-3         1/1     Running                 0          17s   
elk-prd-es-default-4         0/1     Init:CrashLoopBackOff   9          22h      
elk-prd-es-default-4         0/1     Init:1/3                10         22h   
elk-prd-es-default-4         0/1     Init:Error              10         22h   
elk-prd-es-default-4         0/1     Init:CrashLoopBackOff   10         22h   

Here are the events:

Events:
  Type     Reason          Age                   From                Message
  ----     ------          ----                  ----                -------
  Warning  BackOff         21m (x5767 over 21h)  kubelet, ecknode04  Back-off restarting failed container
  Normal   SandboxChanged  17m                   kubelet, ecknode04  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled          17m                   kubelet, ecknode04  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.5.0" already present on machine
  Normal   Created         17m                   kubelet, ecknode04  Created container elastic-internal-init-filesystem
  Normal   Started         17m                   kubelet, ecknode04  Started container elastic-internal-init-filesystem
  Normal   Pulled          17m (x4 over 17m)     kubelet, ecknode04  Container image "docker.elastic.co/elasticsearch/elasticsearch:7.5.0" already present on machine
  Normal   Created         17m (x4 over 17m)     kubelet, ecknode04  Created container elastic-internal-init-keystore
  Normal   Started         17m (x4 over 17m)     kubelet, ecknode04  Started container elastic-internal-init-keystore
  Warning  BackOff         2m50s (x77 over 17m)  kubelet, ecknode04  Back-off restarting failed container

Can anyone help me resolve this issue?

Hi @sfgroups1, if you look at the logs of the pod (docs here: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-troubleshooting.html#k8s-get-elasticsearch-logs) you may get a better idea of why it is crashing.

The log does not show any error; it says the pod started. Something else is preventing the pod from reaching ready status.

 kgp |grep elk-prd-es-default-4
elk-prd-es-default-4         0/1     Init:CrashLoopBackOff   80         28h

k logs elk-prd-es-default-4 |tail -1
{"type": "server", "timestamp": "2019-12-08T17:03:29,608Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "elk-prd", "node.name": "elk-prd-es-default-4", "message": "started", "cluster.uuid": "3d9MXxV2S4-226M5VR7cjA", "node.id": "LEnpjkmaSbm6ROM2XQIDSg"  }

The node log shows this error message:

Error syncing pod 16eff8e8-6b1a-4de7-b031-6af8d78ddb12 ("elk-prd-es-default-4_elastic-system(16eff8e8-6b1a-4de7-b031-6af8d78ddb12)"), skipping: failed to "StartContainer" for "elastic-internal-init-keystore" with CrashLoopBackOff: "back-off 5m0s restarting failed container=elastic-internal-init-keystore pod=elk-prd-es-default-4_elastic-system(16eff8e8-6b1a-4de7-b031-6af8d78ddb12)"

Can you post your Elasticsearch yaml manifest?
Also can you give us the output of the init container logs?

kubectl logs elk-prd-es-default-4 -c elastic-internal-init-keystore
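
If the current attempt has not written anything yet, the logs of the previous attempt and the init container's last exit code are usually more telling. Here is a minimal sketch, assuming the pod lives in the elastic-system namespace (as the node log above suggests). The init_diag function and the KUBECTL variable are hypothetical conveniences, not part of ECK:

```shell
#!/bin/sh
# Hypothetical helper: collect diagnostics for a failing init container.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

init_diag() {
  pod="$1"; container="$2"; ns="${3:-elastic-system}"
  # Logs of the previous (crashed) attempt of the init container.
  $KUBECTL logs "$pod" -n "$ns" -c "$container" --previous
  # Exit code of the last terminated attempt of each init container.
  $KUBECTL get pod "$pod" -n "$ns" \
    -o 'jsonpath={.status.initContainerStatuses[*].lastState.terminated.exitCode}'
}
```

For the pod in this thread it would be called as: init_diag elk-prd-es-default-4 elastic-internal-init-keystore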

Here is the output. I am unsure why it is looking for a terminal.

# kubectl logs elk-prd-es-default-0 -c elastic-internal-init-keystore
+ echo 'Initializing keystore.'
+ /usr/share/elasticsearch/bin/elasticsearch-keystore create
Initializing keystore.
Exception in thread "main" java.lang.IllegalStateException: unable to read from standard input; is standard input open and a tty attached?
        at org.elasticsearch.cli.Terminal$SystemTerminal.readText(Terminal.java:207)
        at org.elasticsearch.cli.Terminal.promptYesNo(Terminal.java:140)
        at org.elasticsearch.common.settings.CreateKeyStoreCommand.execute(CreateKeyStoreCommand.java:43)
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125)
        at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:77)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:125)
        at org.elasticsearch.cli.Command.main(Command.java:90)
        at org.elasticsearch.common.settings.KeyStoreCli.main(KeyStoreCli.java:41)

This looks like a wrong keystore init container command.
Can you please post your elasticsearch resource yaml manifest?
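
For context, the trace shows promptYesNo being called from CreateKeyStoreCommand. That is the prompt elasticsearch-keystore create issues when a keystore file already exists and it asks whether to overwrite it. An init container has no TTY, so the prompt fails. This would fit a pod whose volumes survive a node reboot while the init container runs again. Below is a minimal sketch of a guard that makes the step safe to re-run, assuming the default install path seen in the log. The init_keystore function and the variable names are illustrative, and this is not the script ECK actually ships:

```shell
#!/bin/sh
# Sketch of an idempotent keystore init step (not ECK's actual script).
ES_HOME="${ES_HOME:-/usr/share/elasticsearch}"
KEYSTORE_BIN="${KEYSTORE_BIN:-$ES_HOME/bin/elasticsearch-keystore}"
KEYSTORE_FILE="${KEYSTORE_FILE:-$ES_HOME/config/elasticsearch.keystore}"

init_keystore() {
  if [ -f "$KEYSTORE_FILE" ]; then
    # A keystore left over from an earlier run would trigger the
    # interactive overwrite prompt, so skip creation instead.
    echo "Keystore already exists, skipping create."
    return 0
  fi
  echo "Initializing keystore."
  "$KEYSTORE_BIN" create
}
```

Deleting the pod, the workaround reported above, presumably sidesteps the prompt because the replacement pod starts without an existing keystore file.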

Here is the YAML file. I only have this issue after a server restart; if I delete the crash-looping pod, the new pod starts properly.

apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: elk-prd
spec:
  version: 7.5.0  
  nodeSets:
    - name: default
      count: 5
      config:
        node.master: true
        node.data: true
        node.ingest: true
        node.store.allow_mmap: false 
      podDisruptionBudget:
        spec:
          maxUnavailable: 2
          minAvailable: 3
          selector:
            matchLabels:
              elasticsearch.k8s.elastic.co/cluster-name: elk-prd
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 490Gi
            storageClassName: local-storage