Cannot use SecureSettings and availability zone awareness at the same time

Hi there,

I am trying to use the SecureSettings section to hold the credentials for the AWS S3 snapshot repository, but the Kubernetes pod fails at the elastic-internal-init-keystore init container step. The error message is:

% kubectl logs il-elasticsearch-service-next-es-data-d-2 -c elastic-internal-init-keystore
+ keystore_initialized_flag=/usr/share/elasticsearch/config/elastic-internal-init-keystore.ok
+ [[ -f /usr/share/elasticsearch/config/elastic-internal-init-keystore.ok ]]
+ echo 'Initializing keystore.'
+ /usr/share/elasticsearch/bin/elasticsearch-keystore create
Initializing keystore.
Exception in thread "main" java.lang.IllegalArgumentException: Could not resolve placeholder 'ZONE'
	at org.elasticsearch.common.settings.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:105)
	at org.elasticsearch.common.settings.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:58)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1167)
	at org.elasticsearch.common.settings.Settings$Builder.replacePropertyPlaceholders(Settings.java:1123)
	at org.elasticsearch.node.InternalSettingsPreparer.initializeSettings(InternalSettingsPreparer.java:97)
	at org.elasticsearch.node.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:79)
	at org.elasticsearch.cli.EnvironmentAwareCommand.createEnv(EnvironmentAwareCommand.java:89)
	at org.elasticsearch.cli.EnvironmentAwareCommand.createEnv(EnvironmentAwareCommand.java:80)
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:75)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:116)
	at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:80)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:116)
	at org.elasticsearch.cli.Command.main(Command.java:79)
	at org.elasticsearch.common.settings.KeyStoreCli.main(KeyStoreCli.java:32)

I have also enabled availability zone awareness for the nodes via the following two lines. It seems that the environment variable ZONE is not set in the elastic-internal-init-keystore container, which initializes the keystore, and I could not find a way to add that environment variable to the elastic-internal-init-keystore container from my YAML.

// node.attr.zone: ${ZONE}
// cluster.routing.allocation.awareness.attributes: k8s_node_name,zone

The following is a simplified version of my YAML.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: il-elasticsearch-service-next
  annotations:
    eck.k8s.elastic.co/downward-node-labels: "topology.kubernetes.io/zone"
spec:
  version: 7.13.4
  image: registry.deployment.intralinks.com:5000/il/il-search-base-images-es-rbl:7.13.4-47_g2a93cbb
  http:
    service:
      spec:
        type: ClusterIP
        selector:
          elasticsearch.k8s.elastic.co/cluster-name: il-elasticsearch-service-next
  secureSettings:
    - secretName: il-elasticsearch-service-next-es-snapshot-aws-s3-access
    - secretName: il-elasticsearch-service-next-es-snapshot-aws-s3-secret
  nodeSets:
    - name: all-in-one
      count: 3
      config:
        node.roles: ["master", "data"]
        node.attr.zone: ${ZONE}
        cluster.routing.allocation.awareness.attributes: k8s_node_name,zone
        xpack.monitoring.collection.enabled: true
        xpack.monitoring.elasticsearch.collection.enabled: false
      
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            storageClassName: general
      
      podTemplate:
        spec:
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
          containers:
            - name: elasticsearch
              env:
                - name: ZONE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.annotations['topology.kubernetes.io/zone']
              resources:
                requests:
                  memory: 4Gi
                  cpu: 2000m
                limits:
                  memory: 4Gi
                  cpu: 2000m
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  elasticsearch.k8s.elastic.co/cluster-name: il-elasticsearch-service-next
                  elasticsearch.k8s.elastic.co/statefulset-name: il-elasticsearch-service-next-es-all-in-one

This is a known issue. The documentation on the main development branch contains an example that also sets the environment variable on the elastic-internal-init-keystore init container, which fixes it:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: il-elasticsearch-service-next
  annotations:
    eck.k8s.elastic.co/downward-node-labels: "topology.kubernetes.io/zone"
spec:
  version: 7.13.4
  image: registry.deployment.intralinks.com:5000/il/il-search-base-images-es-rbl:7.13.4-47_g2a93cbb
  http:
    service:
      spec:
        type: ClusterIP
        selector:
          elasticsearch.k8s.elastic.co/cluster-name: il-elasticsearch-service-next
  secureSettings:
    - secretName: il-elasticsearch-service-next-es-snapshot-aws-s3-access
    - secretName: il-elasticsearch-service-next-es-snapshot-aws-s3-secret
  nodeSets:
    - name: all-in-one
      count: 3
      config:
        node.roles: ["master", "data"]
        node.attr.zone: ${ZONE}
        cluster.routing.allocation.awareness.attributes: k8s_node_name,zone
        xpack.monitoring.collection.enabled: true
        xpack.monitoring.elasticsearch.collection.enabled: false

      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            storageClassName: general

      podTemplate:
        spec:
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
            - name: elastic-internal-init-keystore
              env:
                - name: ZONE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.annotations['topology.kubernetes.io/zone']
          containers:
            - name: elasticsearch
              env:
                - name: ZONE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.annotations['topology.kubernetes.io/zone']
              resources:
                requests:
                  memory: 4Gi
                  cpu: 2000m
                limits:
                  memory: 4Gi
                  cpu: 2000m
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  elasticsearch.k8s.elastic.co/cluster-name: il-elasticsearch-service-next
                  elasticsearch.k8s.elastic.co/statefulset-name: il-elasticsearch-service-next-es-all-in-one
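
Since the two manifests are long and nearly identical, here is the change reduced to the part that differs. This is only a fragment of the nodeSet's podTemplate, not a complete manifest, and it assumes the operator merges an init container declared under the name elastic-internal-init-keystore with the one it generates itself:

```yaml
podTemplate:
  spec:
    initContainers:
      # Uses the same name as the operator-managed keystore init container,
      # so that (by assumption) the env entry is merged into that container.
      - name: elastic-internal-init-keystore
        env:
          # Same ZONE definition as on the main elasticsearch container,
          # so that ${ZONE} in the node config can be resolved here too.
          - name: ZONE
            valueFrom:
              fieldRef:
                fieldPath: metadata.annotations['topology.kubernetes.io/zone']
```

The elasticsearch container keeps its own ZONE entry. The keystore tool reads the same node configuration, so any other placeholder in that configuration would presumably have to be defined on both containers in the same way.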

The problem is resolved. Thank you, Michael, for your help. I appreciate it.