Single master node cluster dies when master node dies

iojas · August 16, 2021, 2:04pm

I am using ECK to deploy an ES cluster. My setup has 1 master node and 3 data nodes. If the master node dies for some reason, it comes back up (thanks to StatefulSets in Kubernetes) but associates itself with a different cluster ID, so it rejects the join requests from the data nodes.

But when I run an upgrade or something similar, where Kubernetes "safely" removes and brings back the master node, the cluster becomes healthy again after a while.

I tried with 3 master nodes as well, but when I kill one node it is never able to rejoin the existing cluster, and the cluster stays in the yellow state forever.

Now my question is: if some issue occurs and the master node is not restarted safely, how can I make sure the cluster forms correctly?

Here's the YAML I use to deploy the cluster with ECK on my Kubernetes cluster.

---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: sifter-elastic-data-factory
spec:
  version: 7.14.0
  nodeSets:
    - name: master
      count: 1
      config:
        node.master: true
        node.data: false
        node.ingest: false
      podTemplate:
        spec:
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
          containers:
            - name: elasticsearch
              resources:
                requests:
                  memory: 8Gi
                  cpu: 3000m
                limits:
                  memory: 8Gi
                  cpu: 3000m
              env:
                - name: ES_JAVA_OPTS
                  value: -Xms6g -Xmx6g
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: ssd
    - name: data
      count: 3
      config:
        node.master: false
        node.data: true
        node.ingest: true
      podTemplate:
        spec:
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
          containers:
            - name: elasticsearch
              resources:
                requests:
                  memory: 8Gi
                  cpu: 3000m
                limits:
                  memory: 8Gi
                  cpu: 3000m
              env:
                - name: ES_JAVA_OPTS
                  value: -Xms6g -Xmx6g
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 60Gi
            storageClassName: ssd
  http:
    service:
      spec:
        type: ClusterIP
    tls:
      selfSignedCertificate:
        disabled: true

warkolm · August 16, 2021, 10:00pm

Are you using persistent volumes for the master?

iojas · August 17, 2021, 12:28pm

Yes, persistent volumes for the master as well as for the data nodes. Could this be an issue? @warkolm

iojas · August 17, 2021, 1:50pm

@warkolm I guess this was the issue. When I removed the persistent storage from the master node, the cluster always came back up without losing data.

Follow-up question: what is the downside of not using persistent storage with the master node?

warkolm · August 17, 2021, 11:05pm

Your masters need persistent storage. See Node | Elasticsearch Guide [7.14] | Elastic.
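
For illustration, here is a minimal sketch of a master nodeSet that follows this advice. It is not a verified fix for the problem described in this thread. It reuses the names and sizes from the manifest posted above and keeps the `elasticsearch-data` volume claim on the master nodes. The `count: 3` is an assumption added here, on the basis that master-eligible nodes store cluster metadata on disk and that three of them leave a quorum of two if one is lost. The data nodeSet, pod template, and HTTP settings are omitted for brevity and would stay as in the original manifest.

```yaml
# Sketch only: master nodeSet with persistent storage retained.
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: sifter-elastic-data-factory
spec:
  version: 7.14.0
  nodeSets:
    - name: master
      # Assumption: three master-eligible nodes, so two can still
      # form a quorum while the third is restarting.
      count: 3
      config:
        node.master: true
        node.data: false
        node.ingest: false
      volumeClaimTemplates:
        - metadata:
            # ECK expects this claim name for the Elasticsearch data path.
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 50Gi
            storageClassName: ssd
```

With a claim like this, each master pod is expected to reattach to the same volume after a restart, so the cluster metadata it stored there is still available when it comes back.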
