Hi, I have an EKF stack on Kubernetes. It currently has 1 client node, 1 master node and 3 data nodes.
When my master node restarts, I lose all the data and indices that I have and my master's UUID changes, so I then need to restart the other nodes as well so they can connect to the master again.
My questions are:
- If I add 2 more master-eligible nodes will it help me fix this problem, so I don't lose my data when my master node restarts and I don't need to restart the other nodes so they can connect to the master?
- If that is the solution how can I add 2 more master-eligible nodes to my cluster?
- If that is not the solution to this problem, can you please help me solve it?
Having 3 master-eligible nodes will allow your cluster to continue operating even if one master node disappears, and will allow you to perform rolling restarts and upgrades. Also note that dedicated master nodes and data nodes must have persistent storage, which it sounds like your master node does not have.
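As a rough sketch of what persistent storage for three master-eligible nodes could look like on Kubernetes: a StatefulSet gives each pod a stable name and its own PersistentVolumeClaim, so the data path survives a pod restart. Everything specific below (the headless Service name `elasticsearch-master-headless`, the `app` label, the image tag, the 5Gi size, the `fsGroup`) is an assumption for illustration and not taken from your setup. Mounting your existing ConfigMap is left out for brevity.

```yaml
# Sketch only: names, image tag and sizes are assumptions, adjust to your cluster.
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-master-headless   # hypothetical headless Service used for discovery
spec:
  clusterIP: None                       # headless: DNS resolves to the individual pod IPs
  publishNotReadyAddresses: true        # let nodes find each other before they are Ready
  selector:
    app: elasticsearch-master
  ports:
    - name: transport
      port: 9300
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-master
spec:
  serviceName: elasticsearch-master-headless
  replicas: 3                           # three master-eligible nodes
  selector:
    matchLabels:
      app: elasticsearch-master
  template:
    metadata:
      labels:
        app: elasticsearch-master
    spec:
      securityContext:
        fsGroup: 1000                   # assumption: may be needed so the volume is writable by the image's user
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0   # assumed version
          ports:
            - name: transport
              containerPort: 9300
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data   # default data path in the official image
  volumeClaimTemplates:                 # one PersistentVolumeClaim per pod, kept across restarts
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi                # assumed size
```

With this layout the pods would be named `elasticsearch-master-0`, `elasticsearch-master-1` and `elasticsearch-master-2`, and each one reattaches to its own volume after a restart.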
Yes, my master node doesn't have persistent storage. Thank you, I will add it now.
And how can I add one more master node, so when this one disappears my cluster can continue operating?
I am using a Deployment, ConfigMap and Service to deploy the master node. Do I just need to set replicas: 2 in my Deployment, or does it need to be done some other way?
You need a total of 3 dedicated master nodes. Two is insufficient.
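The reason is that a master election needs a majority of the master-eligible nodes: with three, a majority is two, so one node can be lost; with two, a majority is still two. A minimal `elasticsearch.yml` sketch for three dedicated master-eligible nodes might look like the following. The host name `elasticsearch-master-headless` and the node names `elasticsearch-master-0` to `elasticsearch-master-2` are hypothetical, and the fragment assumes a brand-new cluster being started for the first time.

```yaml
# elasticsearch.yml sketch for each dedicated master-eligible node.
# Host and node names below are hypothetical placeholders.

# Dedicated master role (7.9 and later). On older 7.x versions the equivalent is
# node.master: true, node.data: false, node.ingest: false.
node.roles: [ master ]

# Addresses used to find the other master-eligible nodes.
discovery.seed_hosts:
  - elasticsearch-master-headless

# Only for the very first start of a brand-new cluster. The entries must match
# each node's node.name, which defaults to the hostname.
# Remove this setting from every node once the cluster has formed.
cluster.initial_master_nodes:
  - elasticsearch-master-0
  - elasticsearch-master-1
  - elasticsearch-master-2
```

If `cluster.initial_master_nodes` stays in place and a master-eligible node comes back with an empty data path, it can bootstrap a new cluster with a new UUID, which matches the symptom described in this thread.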
Okay, I will add 2 more so that I have 3 master nodes in total, but how can I do that?
When I tried to add one more by changing replicas: 1 to replicas: 2, this was in the logs of the newly created master node:
"WARN", "message":"This node is a fully-formed single-node cluster with cluster UUID [g2UGQU6JS0CFmjOjSXZdqg], but it is configured as if to discover other nodes and form a multi-node cluster via the [discovery.seed_hosts=[elasticsearch-master, elasticsearch-data, elasticsearch-client]] setting. Fully-formed clusters do not attempt to discover other nodes, and nodes with different cluster UUIDs cannot belong to the same cluster. The cluster UUID persists across restarts and can only be changed by deleting the contents of the node's data path(s). Remove the discovery configuration to suppress this message."
This is the log I get on my master node after I restart it, and now the data nodes cannot join until I restart them too:
"[elasticsearch-data][192.168.217.89:9300][internal:cluster/coordination/join/validate]","error.stack_trace":"org.elasticsearch.transport.RemoteTransportException: [elasticsearch-data][192.168.217.89:9300][internal:cluster/coordination/join/validate]\nCaused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: This node previously joined a cluster with UUID [dFD-TvY3ShWKmVStpjoouQ] and is now trying to join a different cluster with UUID [E96r4lD3TBmftdzjhNPTRQ]. This is forbidden and usually indicates an incorrect discovery or cluster bootstrapping configuration. Note that the cluster UUID persists across restarts and can only be changed by deleting the contents of the node's data paths  which will also remove any data held by this node.
Have a look at this guide:
I need to remove cluster.initial_master_nodes from the configuration of all my nodes.
But because I have already restarted my master node and it has a new cluster UUID, I first need to restart the other nodes so they can join, and only after that delete cluster.initial_master_nodes from the configuration of all my nodes. Is that right?