Killing pods

I'm trying a simple 3-node cluster. All 3 are masters/data/ingest.

If I delete all 3 pods manually, they spin up again with the same names, but they fail to join the cluster. According to the logs, each is looking for completely different, non-existent master nodes.
Is there a way to resolve this, i.e. to ensure that even after deletion the pods try to rejoin the existing masters?

Where does the operator save information about the masters?

The operator itself does not store any information about masters permanently. Cluster membership is handled by Elasticsearch itself.
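For context, Elasticsearch 7 persists its cluster state, including the cluster UUID and the last-known voting configuration, inside each node's data path, so whether that state survives a pod restart depends entirely on the volume backing the data directory. A quick way to inspect what is actually backing it, assuming ECK's default labels, container name, and data path (worth double-checking against your setup):

# list the claims the operator created for this cluster
kubectl get pvc -l elasticsearch.k8s.elastic.co/cluster-name=c8search

# check which filesystem backs the data directory inside a pod;
# an emptyDir typically shows the node's own filesystem, a PVC its own device
kubectl exec <pod-name> -c elasticsearch -- df -h /usr/share/elasticsearch/data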

Are you deleting all three nodes of the cluster at once? If so, the cluster has no chance of recovering. You can only safely remove fewer than half of the master-eligible nodes at once (so at most one node in a three-node cluster); see https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-discovery-adding-removing-nodes.html
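In practice that means taking nodes down one at a time and letting each rejoin before touching the next. A minimal sketch, assuming (as reported above) that pods come back with the same names; the pod names below are illustrative only:

# roll through the cluster one node at a time
for pod in c8search-es-1 c8search-es-2 c8search-es-3; do
  kubectl delete pod "$pod"
  # give the operator a moment to recreate the pod object
  sleep 30
  # wait for the replacement to be ready before removing the next node
  kubectl wait --for=condition=Ready "pod/$pod" --timeout=10m
done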

I'm removing pods, and they get re-created. I have PVCs, so they should re-use the data from those PVCs, right?
Who keeps track of what the suffix of the pod name should be?

Step-by-step scenario:

  1. operator is spun up
  2. 3 pods for the ES cluster are running
  3. Kill the 3 ES cluster pods (to emulate a cluster failure, for example, or an invasive cluster upgrade)
  4. 3 pods are spun up again with the same names, however none of them work: each tries to connect to phantom masters which don't exist anywhere.

Expectation:
Instead of step 4, the pods spin back up, mount the same volumes, and everything is back to normal.
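A useful diagnostic at step 4 is whether the recreated pods are backed by persistent claims at all. The sketch below (the pod name is a placeholder) prints each volume in the pod spec together with the claim backing it; an empty second column means the volume is not a PVC, e.g. an emptyDir:

# volume name <tab> backing claim (empty if not a PVC)
kubectl get pod <pod-name> \
  -o jsonpath='{range .spec.volumes[*]}{.name}{"\t"}{.persistentVolumeClaim.claimName}{"\n"}{end}'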

@t0ffel can you share your Elasticsearch cluster specification (the YAML file)? I'd like to double-check the PersistentVolumeClaim setup.

Here is the CR:

apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: c8search
spec:
  version: "7.1.0"
  nodes:
  - config:
      # most Elasticsearch configuration parameters are possible to set, e.g:
      node.master: true
      node.data: true
      node.ingest: true
      #node.attr.attr_name: attr_value
      http.cors.enabled: true
      http.cors.allow-origin: "*"
      http.cors.allow-methods: OPTIONS, HEAD, GET, POST, PUT, DELETE

    podTemplate:
      metadata:
        labels:
          # additional labels for pods
          app: c8search
        annotations:
          prometheus.io/scrape: "false"
      spec:
        containers:
        - name: elasticsearch
          resources:
            # specify resource limits and requests
            limits:
              # by default, we will size the heap size of ES to half of the memory limit
              memory: 2Gi
              cpu: 1
    nodeCount: 3
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100G

If using ECK version 0.8 (or 0.8.1), you should name your volumeClaimTemplates data. See the quickstart section about persistent storage. Otherwise these volumes won't be mapped to the actual Elasticsearch data directory, which will use an emptyDir volume instead (not persisted across pod deletion).

I guess you may have been misled by the docs for the master branch, in which we changed that volume name from data to elasticsearch-data. That change will only apply starting with ECK v0.9 (not released yet).
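Since the correct volume name depends on the operator version, it can help to confirm which version is actually running. A hedged check, assuming the quickstart installation (an elastic-operator StatefulSet in the elastic-system namespace):

# print the operator image, which encodes the version tag
kubectl -n elastic-system get statefulset elastic-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}'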

Sorry for the confusion, things should work better with

volumeClaimTemplates:
    - metadata:
        name: data
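Once the claim template is renamed, a quick way to confirm persistence; the pod name is a placeholder, and the claim naming assumes the usual <template-name>-<pod-name> convention, which may differ slightly in 0.8:

kubectl get pvc                  # claims should now be named data-<pod-name>
kubectl delete pod <pod-name>
kubectl get pvc                  # same claims, same age: the volumes were reused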