Hello,
I am working to set up an ECK cluster on GCP that should contain 5 nodes across two node sets:
- 3 master nodes
- 2 data nodes
I am using Terraform to deploy this to GCP, a Helm chart to install the Elastic Operator, and a kubernetes_manifest resource to define the Elasticsearch cluster.
At the moment I have two issues that I am not able to find a solution to.
- Update strategy is not working as expected.
As you can see in the manifest, I am setting the changeBudget in order to apply a RollingUpdate strategy to my cluster, but it does not seem to be working. To test this, I simply increase or decrease the pods' resources, and as a result all cluster pods move to the Terminating state and then the cluster is recreated. I would expect the pods to perform a rolling update according to my changeBudget settings, as I don't want any downtime in my system.
- Persistent Volumes gets deleted
In addition, in situations like the previous one, when pod resources are updated and all pods are recreated, the data is also lost. What should be done so that the PersistentVolumes actually persist and are reattached to the new pods?
This is the terraform file:
# Install the ECK operator from the official Elastic Helm repository.
resource "helm_release" "elastic" {
  name       = "elastic-operator"
  namespace  = local.namespace
  repository = "https://helm.elastic.co"
  chart      = "eck-operator"
  version    = "2.14.0"

  # Keep Helm's default upgrade behaviour: do not force pod recreation
  # or a full release replacement on changes.
  recreate_pods = false
  replace       = false
}
# Delays of 30s and 120s to give the ECK operator time to become ready
# before the custom resources below are applied.
resource "time_sleep" "wait_30_seconds" {
  depends_on      = [helm_release.elastic]
  create_duration = "30s"
}

resource "time_sleep" "wait_120_seconds" {
  depends_on      = [helm_release.elastic]
  create_duration = "120s"
}
# Elasticsearch cluster ("datariver") managed by the ECK operator.
#
# NOTE(review): if `terraform plan` shows this resource being REPLACED
# (destroy + create) rather than updated in place, the Elasticsearch CR
# is deleted first — and when the CR is deleted, ECK tears down the whole
# cluster and (by default) its PersistentVolumeClaims. That would explain
# both the all-pods-Terminating behaviour and the data loss. Make sure
# changes result in an in-place update (see the `computed_fields`
# argument of kubernetes_manifest); ECK then performs a rolling upgrade
# honouring the `updateStrategy.changeBudget` defined below. TODO confirm
# against the plan output.
resource "kubernetes_manifest" "elasticsearch" {
  manifest = {
    apiVersion = "elasticsearch.k8s.elastic.co/v1"
    kind       = "Elasticsearch"

    # Cluster metadata
    metadata = {
      "name"      = "datariver"
      "namespace" = local.namespace
    }

    spec = {
      version = "8.15.0"

      nodeSets = [
        {
          name = "master-node"
          # The CRD schema expects an integer here, not the string "3".
          count = 3
          config = {
            "node.roles"            = ["master"]
            "node.store.allow_mmap" = false
          }
          podTemplate = {
            # Set virtual memory for production nodes as in the link below
            # https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-virtual-memory.html
            spec = {
              securityContext = {
                fsGroup   = 1000
                runAsUser = 1000
              }
              # updateStrategy is NOT a podTemplate field — it lives at
              # spec level (see the bottom of this manifest), so the
              # commented-out copy that used to sit here was removed.
              initContainers = [
                {
                  # Using an init container to set virtual memory
                  # https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-virtual-memory.html
                  name    = "sysctl"
                  command = ["sh", "-c", "sysctl -w vm.max_map_count=262144"]
                  securityContext = {
                    privileged = true
                    runAsUser  = 0
                    runAsGroup = 0
                  }
                },
                # The repurpose-node command should be used to clean up data
                # {
                #   "name"    = "repurpose-node"
                #   "image"   = "docker.elastic.co/elasticsearch/elasticsearch:8.14.3"
                #   "command" = ["sh", "-c", "yes | ./bin/elasticsearch-node repurpose -v && chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data"]
                #   "volumeMounts" = [
                #     {
                #       "name"      = "elasticsearch-data"
                #       "mountPath" = "/usr/share/elasticsearch/data"
                #     }
                #   ]
                #   "securityContext" = {
                #     "privileged" = true
                #     "runAsUser"  = 0
                #     "runAsGroup" = 0
                #   }
                # }
              ]
              containers = [
                {
                  name = "elasticsearch"
                  securityContext = {
                    readOnlyRootFilesystem = false
                    runAsUser              = 1000
                    runAsGroup             = 1000
                  }
                  resources = {
                    requests = {
                      memory = "2Gi"
                      cpu    = "2"
                    }
                    limits = {
                      memory = "2Gi"
                      cpu    = "4"
                    }
                  }
                }
              ]
            }
          }
          # PVCs created from this template are kept across rolling
          # upgrades; they are only removed when the nodeSet is scaled
          # down or the Elasticsearch CR itself is deleted.
          volumeClaimTemplates = [
            {
              metadata = {
                name = "elasticsearch-data"
              }
              spec = {
                accessModes = ["ReadWriteOnce"]
                resources = {
                  requests = {
                    storage = "10Gi"
                  }
                }
                storageClassName = "standard-rwo"
              }
            }
          ]
        },
        {
          name = "data-node"
          # The intended topology is 2 data nodes (the original manifest
          # had "3" here, as a string); the CRD expects an integer.
          count = 2
          config = {
            "node.roles"            = ["data", "ingest"]
            "node.store.allow_mmap" = false
          }
          podTemplate = {
            # Set virtual memory for production nodes as in the link below
            # https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-virtual-memory.html
            spec = {
              securityContext = {
                fsGroup   = 1000
                runAsUser = 1000
              }
              initContainers = [
                {
                  # Using an init container to set virtual memory
                  # https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-virtual-memory.html
                  name    = "sysctl"
                  command = ["sh", "-c", "sysctl -w vm.max_map_count=262144"]
                  securityContext = {
                    privileged = true
                    runAsUser  = 0
                    runAsGroup = 0
                  }
                }
              ]
              containers = [
                {
                  name = "elasticsearch"
                  securityContext = {
                    readOnlyRootFilesystem = false
                    runAsUser              = 1000
                    runAsGroup             = 1000
                  }
                  resources = {
                    requests = {
                      memory = "2Gi"
                      cpu    = "2"
                    }
                    limits = {
                      memory = "4Gi"
                      cpu    = "4"
                    }
                  }
                }
              ]
            }
          }
          volumeClaimTemplates = [
            {
              metadata = {
                name = "elasticsearch-data"
              }
              spec = {
                accessModes = ["ReadWriteOnce"]
                resources = {
                  requests = {
                    storage = "30Gi"
                  }
                }
                storageClassName = "standard-rwo"
              }
            }
          ]
        }
      ]

      # Update strategy: at most 1 extra pod and 1 unavailable pod at a
      # time during rolling upgrades.
      # https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-update-strategy.html
      updateStrategy = {
        changeBudget = {
          maxSurge       = 1
          maxUnavailable = 1
        }
      }
    }
  }

  field_manager {
    force_conflicts = true
  }

  depends_on = [helm_release.elastic, time_sleep.wait_30_seconds]
}
# Kibana instance connected to the "datariver" Elasticsearch cluster via
# elasticsearchRef (ECK wires up the connection and credentials).
resource "kubernetes_manifest" "kibana" {
  manifest = {
    apiVersion = "kibana.k8s.elastic.co/v1"
    kind       = "Kibana"
    metadata = {
      name      = "datariver"
      namespace = local.namespace
    }
    spec = {
      version = "8.15.0"
      # The CRD schema expects an integer here, not the string "1".
      count = 1
      elasticsearchRef = {
        name = "datariver"
      }
    }
  }

  depends_on = [helm_release.elastic, kubernetes_manifest.elasticsearch, time_sleep.wait_120_seconds]
}
I would really appreciate some help setting up my cluster to use the RollingUpdate strategy and also keeping the PersistentVolumes attached to my pods.
Thanks