GKE Upgrade and PDB

Vincent_Ngai · January 20, 2020, 2:27am

Hey Guys,

I am installing ECK on GKE.
As you may know, GKE provides cluster auto-upgrade as well as normal (manual) node pool upgrades. Either way, it upgrades nodes one by one in each zone.

If we set up a PDB, GKE will respect it for a maximum of 1 hour.

However, I don't know if 1 hour is enough for the replicas to relocate to another node.

What happens in that case, and does ECK react accordingly?

https://cloud.google.com/kubernetes-engine/docs/how-to/upgrading-a-cluster

sebgl · January 20, 2020, 8:23am

Hi Vincent,

ECK already sets up a PDB with a maximum of one Pod allowed to be taken down (see https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-pod-disruption-budget.html). I think that should be good enough for most cases where you don't run multiple Elasticsearch Pods per Kubernetes node.
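
As a rough illustration of what "a maximum of one Pod allowed to be taken down" means in practice, here is a minimal Python sketch of the eviction check a PDB implies. It is a simplified model, not ECK or Kubernetes code: the function names are hypothetical, and it assumes the budget is expressed as a minimum number of healthy Pods equal to the expected Pod count minus one.

```python
def disruptions_allowed(healthy_pods: int, min_available: int) -> int:
    """Number of Pods that may be evicted right now (never negative)."""
    return max(0, healthy_pods - min_available)


def can_evict(expected_pods: int, healthy_pods: int) -> bool:
    """True if one more voluntary eviction fits within the budget."""
    # Assumption: the budget tolerates exactly one missing Pod,
    # i.e. min_available = expected_pods - 1.
    return disruptions_allowed(healthy_pods, expected_pods - 1) >= 1
```

With 3 Elasticsearch Pods all healthy, `can_evict(3, 3)` allows one eviction; while that Pod is still down, `can_evict(3, 2)` refuses a second one.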

dkow · January 20, 2020, 8:51am

Hi @Vincent_Ngai, thanks for your message.

As for replica relocation: if you are using network-attached storage, relocation normally isn't needed. When the pod disappears and gets recreated on another node, its PV follows it and gets reattached.

If you are using local storage and it gets lost during the node upgrade, ES will replicate the missing state once the new pod is up and running. How long that takes depends on the state size, network performance, and cluster load.
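
For a back-of-the-envelope feel for that dependency, the following Python sketch estimates recovery time as data size divided by sustained throughput and compares it with the 1 hour window mentioned in the question. The function names and the 40 MiB/s figure are illustrative assumptions, not measurements, and the model ignores cluster load and parallel recoveries.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def estimated_recovery_seconds(data_bytes: int, throughput_bytes_per_sec: float) -> float:
    """Naive estimate: bytes to copy divided by sustained throughput."""
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return data_bytes / throughput_bytes_per_sec


def fits_in_window(data_bytes: int, throughput_bytes_per_sec: float,
                   window_seconds: float = 3600.0) -> bool:
    """True if the estimated recovery finishes within the window (default 1 hour)."""
    return estimated_recovery_seconds(data_bytes, throughput_bytes_per_sec) <= window_seconds
```

Under these assumptions, 100 GiB at 40 MiB/s takes about 2560 seconds and fits within the hour, while 200 GiB at the same rate does not.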

Currently, we don't have any specific recommendations around this, but I've created https://github.com/elastic/cloud-on-k8s/issues/2448 to track it.

Vincent_Ngai · January 20, 2020, 9:16am

Yes, I know ECK lets us set a PDB.
But as I said, the GKE cluster will not wait forever.

Let's say my cluster has 3 nodes (each physical node running 1 ECK node), I have 1 replica for my index, and my PDB only allows 1 unavailable.

If GKE gets upgraded,
it will kill one of the nodes,
and ECK will start reassigning the shards to another node (I suppose it will?).

In the meantime, GKE will wait because of the PDB.
However, it will only wait 1 hour.

It doesn't even know whether the shard migration has completed.
If that takes more than 1 hour,
does that mean there is a high chance my data will be lost?

Vincent_Ngai · January 20, 2020, 9:20am

I am using PD (persistent disk) in GKE,
which is a zonal disk.
Losing the pod is fine, since the recreated pod can mount the disk again.
However, I am not sure when ECK will do the migration.

dkow · January 20, 2020, 10:17am

In this case, your cluster should be fine.

If you are using persistent disks, the pod will be deleted and recreated on a different k8s node. As soon as this happens, the PV on your persistent disk will be attached to the pod and ES will continue operating as normal. There is no migration to be done, as the data on that PD was not lost. If the pod takes some time to come up, a data migration to a different pod might start, but as soon as the new pod (with its already existing PV) rejoins the ES cluster, the migration will be cancelled.
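
The sequence above can be sketched as a small Python model. It is only an illustration, not ECK or Elasticsearch code: the function name is hypothetical, and it assumes the wait before reallocation is governed by Elasticsearch's delayed allocation setting (`index.unassigned.node_left.delayed_timeout`, 1 minute by default), which is not mentioned in this thread.

```python
def outcome_after_pod_restart(pod_downtime_s: float,
                              allocation_delay_s: float = 60.0) -> str:
    """Describe what happens to the shards of a pod that keeps its PV."""
    if pod_downtime_s < 0 or allocation_delay_s < 0:
        raise ValueError("durations must be non-negative")
    if pod_downtime_s <= allocation_delay_s:
        # The pod rejoined before the delay expired: nothing was moved.
        return "no migration started"
    # The delay expired, so copying to another pod began; it is cancelled
    # once the original pod rejoins with its existing data.
    return "migration started, cancelled on rejoin"
```

In both branches the data on the persistent disk is reused, so the 1 hour limit only matters for how long the pod itself takes to come back.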

Vincent_Ngai · January 20, 2020, 10:48am

Thanks man, that really addresses my concern.

dkow · January 20, 2020, 11:00am

Happy to help!
