How to deploy Elasticsearch on k8s and avoid the "join validation on cluster state with a different cluster uuid" error

I am trying to deploy the Elastic Stack on top of k8s with 2 client nodes, 3 master nodes, and 3 data nodes, and each of them is currently scheduled on a different k8s node.

For the 2 client nodes I am using a k8s Deployment, and for the master and data nodes I am using k8s StatefulSets. It worked the first time, when all nodes were fresh, but when I updated the master nodes' StatefulSet I got this error:

Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid gR1WEwUpRXynUdhOyF2axA than local cluster uuid 1gwMWNx0TPSd3j9-hxlgcA, rejecting

Here are the related k8s configs for each node type.

A common ConfigMap is shared by all 3 node types; the per-node settings are controlled through environment variables.

apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch
  namespace: elasticsearch
  labels:
    app: elasticsearch
data:
  elasticsearch.yml: |-
    cluster:
      name: ${CLUSTER_NAME}
      initial_master_nodes: "es-master-0,es-master-1,es-master-2"
    node:
      master: ${NODE_MASTER}
      data: ${NODE_DATA}
      name: ${NODE_NAME}
      ingest: ${NODE_INGEST}
      max_local_storage_nodes: 1
      attr.box_type: hot
    processors: ${PROCESSORS:1}
    network.host: ${NETWORK_HOST}
    path:
      data: /usr/share/elasticsearch/data
      logs: /usr/share/elasticsearch/logs
    http:
      compression: true
    discovery:
      seed_hosts: ${DISCOVERY_SERVICE}
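
For context, Elasticsearch substitutes ${VAR} references in elasticsearch.yml from the environment, which is why the same ConfigMap can serve all three node types. The ConfigMap still has to be mounted over the image's elasticsearch.yml in each pod; a minimal sketch of that mount is below, where the volume name "config" is an assumption rather than something taken from the original manifests.

# Sketch: container/volume fragment of a pod template that mounts the ConfigMap above.
# The volume name "config" is hypothetical; adapt it to the real manifests.
      containers:
        - name: elasticsearch
          volumeMounts:
            - name: config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
      volumes:
        - name: config
          configMap:
            name: elasticsearch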

Master node:

A headless service for the master nodes:

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
  namespace: elasticsearch
  labels:
    component: elasticsearch
    role: master
spec:
  selector:
    component: elasticsearch
    role: master
  ports:
    - name: transport
      port: 9300
      protocol: TCP
  clusterIP: None
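
Because clusterIP is None, this Service is headless: resolving its name returns the addresses of the master pods directly, which is what discovery.seed_hosts: ${DISCOVERY_SERVICE} in the ConfigMap relies on. A small sketch of the effective setting after substitution; the fully qualified name in the comment is just the standard cluster DNS layout, not something quoted from this thread.

# Effective setting once DISCOVERY_SERVICE=elasticsearch-discovery is substituted.
discovery:
  seed_hosts: elasticsearch-discovery   # resolves via elasticsearch-discovery.elasticsearch.svc.cluster.local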

configs for master node

  - name: elasticsearch
    env:
    - name: CLUSTER_NAME
      value: logs001
    - name: NUMBER_OF_MASTERS
      value: "3"
    - name: NODE_MASTER
      value: "true"
    - name: NODE_INGEST
      value: "false"
    - name: NODE_DATA
      value: "false"
    - name: NETWORK_HOST
      value: "0.0.0.0"
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: DISCOVERY_SERVICE
      value: elasticsearch-discovery
    - name: KUBERNETES_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: PROCESSORS
      valueFrom:
        resourceFieldRef:
          resource: limits.cpu
    - name: ES_JAVA_OPTS
      value: -Xms48g -Xmx48g

configs for data nodes

  env:
    - name: CLUSTER_NAME
      value: logs001
    - name: NODE_MASTER
      value: "false"
    - name: NODE_INGEST
      value: "false"
    - name: NETWORK_HOST
      value: "_eth0_"
    - name: NUMBER_OF_MASTERS
      value: "3"
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: DISCOVERY_SERVICE
      value: elasticsearch-discovery
    - name: KUBERNETES_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: NODE_DATA
      value: "true"
    - name: PROCESSORS
      valueFrom:
        resourceFieldRef:
          resource: limits.cpu
    - name: ES_JAVA_OPTS
      value: -Xms48g -Xmx48g

configs for client node

   env:
    - name: CLUSTER_NAME
      value: logs001
    - name: NUMBER_OF_MASTERS
      value: "3"
    - name: NODE_MASTER
      value: "false"
    - name: NODE_INGEST
      value: "true"
    - name: NODE_DATA
      value: "false"
    - name: NETWORK_HOST
      value: "_eth0_"
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: DISCOVERY_SERVICE
      value: elasticsearch-discovery
    - name: KUBERNETES_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: PROCESSORS
      valueFrom:
        resourceFieldRef:
          resource: limits.cpu
    - name: ES_JAVA_OPTS
      value: -Xms6g -Xmx6g

I think this means that your master nodes all restarted at once and were not using persistent storage, so they lost the cluster metadata. Your master nodes must use storage that persists across restarts.

Thanks for the reply.
The master nodes are deployed as a k8s StatefulSet, so only one pod is restarted at a time. But I am not using persistent storage for the master nodes. Let me check whether the issue gets resolved if I add persistent disks for the master nodes. BTW, I think a master node only holds metadata, so is it OK for a master node to use only a small amount of disk?

I think this is not the case, or else the config you quote above is not the one that Elasticsearch is using.

Yes, they normally need less storage than data nodes.

Appreciate your insights here

Here is the sts for the master node.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    component: elasticsearch
    role: master
  name: es-master
  namespace: elasticsearch
spec:
  serviceName: elasticsearch-master
  replicas: 3 # Number of Elasticsearch master nodes to deploy
  selector:
    matchLabels:
      component: elasticsearch
      role: master
  template:
    metadata:
      labels:
        component: elasticsearch
        role: master
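
Given the suggestion above about persistent storage, the piece that appears to be missing from this StatefulSet is a persistent volume for the data path. A minimal sketch of what could be added under spec is shown below; the volume name and size are assumptions, and masters hold only cluster metadata, so a small disk is usually enough.

# Sketch only: additions under the StatefulSet's spec (names and sizes are assumptions).
  template:
    spec:
      containers:
        - name: elasticsearch
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data   # matches path.data in the ConfigMap
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi   # metadata only, so a modest disk is typically sufficient for masters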

Could you also suggest whether using a k8s StatefulSet is sufficient? Or should I also wait until a restarted master node has fully rejoined the cluster before restarting the next one?

Sorry, I'm not the best person to help with K8s-specific questions. I believe it's possible to use a statefulset, yes, although you might prefer to use the Elasticsearch operator.

I don't really understand this question in the context of Kubernetes, but in general you should try to avoid restarting more than one node at once.
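
For completeness, since the Elasticsearch operator (ECK) was mentioned: a rough sketch of how a similar topology could be declared with it. The apiVersion and kind are ECK's, but the version number, storage sizes, and nodeSet names are assumptions, not taken from this thread.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: logs001
  namespace: elasticsearch
spec:
  version: 7.17.0                        # assumption: use the version you actually run
  nodeSets:
    - name: master
      count: 3
      config:
        node.master: true
        node.data: false
        node.ingest: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data     # ECK expects this claim name for the data path
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi
    - name: data
      count: 3
      config:
        node.master: false
        node.data: true
        node.ingest: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 100Gi

ECK manages discovery, cluster.initial_master_nodes, and rolling restarts itself, so the hand-written ConfigMap and headless discovery Service above would not be needed in that setup.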

Thanks

The issue is solved; what @DavidTurner suggested worked like a charm! Thanks

