Killing pods

I'm trying a simple 3-node cluster. All 3 are masters/data/ingest.

If I delete all 3 pods manually, they spin up again with the same names, but they fail to join the cluster. According to the logs, each is looking for completely different, non-existent master nodes.
Is there a way to resolve this, i.e. to ensure that even after deletion the pods try to rejoin the existing masters?

Where does the operator save information about the masters?

The operator itself does not store any information about masters permanently. Cluster membership is handled by Elasticsearch itself.
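For context, Elasticsearch 7 persists its cluster state, including the cluster UUID and the last-known voting configuration, inside each node's data path, so whether that state survives a pod restart depends entirely on the volume backing the data directory. A quick way to inspect what is actually backing it, assuming ECK's default labels, container name, and data path (worth double-checking against your setup):

# list the claims the operator created for this cluster
kubectl get pvc -l elasticsearch.k8s.elastic.co/cluster-name=c8search

# check which filesystem backs the data directory inside a pod;
# an emptyDir typically shows the node's own filesystem, a PVC its own device
kubectl exec <pod-name> -c elasticsearch -- df -h /usr/share/elasticsearch/data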

Are you deleting all three nodes of the cluster at once? If so, the cluster has no chance of recovering. You can only safely remove fewer than half of the master-eligible nodes at once (so at most one node in a three-node cluster); see https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-discovery-adding-removing-nodes.html
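In practice that means taking nodes down one at a time and letting each rejoin before touching the next. A minimal sketch, assuming (as reported above) that pods come back with the same names; the pod names below are illustrative only:

# roll through the cluster one node at a time
for pod in c8search-es-1 c8search-es-2 c8search-es-3; do
  kubectl delete pod "$pod"
  # give the operator a moment to recreate the pod object
  sleep 30
  # wait for the replacement to be ready before removing the next node
  kubectl wait --for=condition=Ready "pod/$pod" --timeout=10m
done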

I'm removing pods, and they get re-created. I have PVCs, so they should re-use the data from those PVCs, right?
Who keeps track of what the suffix of the pod name should be?

Step-by-step scenario:

  1. operator is spun up
  2. 3 pods for the ES cluster are running
  3. Kill the 3 ES cluster pods (to emulate a cluster failure, for example, or an invasive cluster upgrade)
  4. 3 pods are spun up again with the same names, however none of them work: each tries to connect to phantom masters which don't exist anywhere.

Expectation:
Instead of step 4, the pods spin back up, mount the same volumes, and everything is back to normal.
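A useful diagnostic at step 4 is whether the recreated pods are backed by persistent claims at all. The sketch below (the pod name is a placeholder) prints each volume in the pod spec together with the claim backing it; an empty second column means the volume is not a PVC, e.g. an emptyDir:

# volume name <tab> backing claim (empty if not a PVC)
kubectl get pod <pod-name> \
  -o jsonpath='{range .spec.volumes[*]}{.name}{"\t"}{.persistentVolumeClaim.claimName}{"\n"}{end}'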

@t0ffel can you share your Elasticsearch cluster specification (the YAML file)? I'd like to double-check the PersistentVolumeClaim setup.

Here is the CR:

apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: c8search
spec:
  version: "7.1.0"
  nodes:
  - config:
      # most Elasticsearch configuration parameters are possible to set, e.g:
      node.master: true
      node.data: true
      node.ingest: true
      #node.attr.attr_name: attr_value
      http.cors.enabled: true
      http.cors.allow-origin: "*"
      http.cors.allow-methods: OPTIONS, HEAD, GET, POST, PUT, DELETE

    podTemplate:
      metadata:
        labels:
          # additional labels for pods
          app: c8search
        annotations:
          prometheus.io/scrape: "false"
      spec:
        containers:
        - name: elasticsearch
          resources:
            # specify resource limits and requests
            limits:
              # by default, we will size the heap size of ES to half of the memory limit
              memory: 2Gi
              cpu: 1
    nodeCount: 3
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100G

If using ECK version 0.8 (or 0.8.1), you should name your volumeClaimTemplates data. See the quickstart section about persistent storage. Otherwise these volumes won't be mapped to the actual Elasticsearch data directory, which will use an emptyDir volume instead (not persisted across pod deletion).

I guess you may have been misled by the docs for the master branch, in which we changed that volume name from data to elasticsearch-data. That change will only apply starting with ECK v0.9 (not released yet).
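Since the correct volume name depends on the operator version, it can help to confirm which version is actually running. A hedged check, assuming the quickstart installation (an elastic-operator StatefulSet in the elastic-system namespace):

# print the operator image, which encodes the version tag
kubectl -n elastic-system get statefulset elastic-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}'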

Sorry for the confusion, things should work better with

volumeClaimTemplates:
    - metadata:
        name: data
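Once the claim template is renamed, a quick way to confirm persistence; the pod name is a placeholder, and the claim naming assumes the usual <template-name>-<pod-name> convention, which may differ slightly in 0.8:

kubectl get pvc                  # claims should now be named data-<pod-name>
kubectl delete pod <pod-name>
kubectl get pvc                  # same claims, same age: the volumes were reused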