How to safely remove a worker node from TKGM clusters
Environment Details
Tanzu Cluster Details
tanzu cluster list
NAME           NAMESPACE   STATUS    CONTROLPLANE   WORKERS   KUBERNETES         ROLES    PLAN
wph-wld-rp01   default     running   1/1            3/3       v1.21.2+vmware.1   <none>   dev
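Cluster details, including the health of individual nodes, can also be inspected from the Tanzu CLI. A quick, optional check using the cluster name from the listing above:
# Show detailed status for the workload cluster and its nodes
tanzu cluster get wph-wld-rp01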
kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
Switched to context "wph-wld-rp01-admin@wph-wld-rp01".
kubectl get nodes
NAME                                 STATUS   ROLES                  AGE     VERSION
wph-wld-rp01-control-plane-c7mxm     Ready    control-plane,master   17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready    <none>                 3m20s   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready    <none>                 17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   Ready    <none>                 3m40s   v1.21.2+vmware.1
# From Management cluster context
kubectl config use-context ph-mgmt-rp01-admin@ph-mgmt-rp01
Switched to context "ph-mgmt-rp01-admin@ph-mgmt-rp01".
kubectl get machines
NAME                                 PROVIDERID                                       PHASE     VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   vsphere://423c4715-64ac-decc-7634-522888597e45   Running   v1.21.2+vmware.1
Safely Removing a Worker Node
Select the worker node for deletion. In this example, it is wph-wld-rp01-md-0-64fc56fb95-mvbj5.
Switch to the workload cluster context
kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
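Before draining, it can help to review what is running on the node so you know which drain flags will be needed. This step is optional; the field selector used below is standard kubectl:
# List every pod currently scheduled on the node selected for removal
kubectl get pods --all-namespaces --field-selector spec.nodeName=wph-wld-rp01-md-0-64fc56fb95-mvbj5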
Drain the node
Drain the node using kubectl drain.
Depending on the workloads running on this node, additional options may be needed. The two other frequently used options are:
--delete-emptydir-data (formerly --delete-local-data)
- Continue even if there are pods using emptyDir volumes (local data that will be deleted when the node is drained).
--force
- Continue even if there are pods that are not managed by a controller.
See kubectl drain --help for the full list of drain options.
kubectl drain wph-wld-rp01-md-0-64fc56fb95-mvbj5 --ignore-daemonsets
node/wph-wld-rp01-md-0-64fc56fb95-mvbj5 already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/calico-node-r5sfv, kube-system/kube-proxy-52pdn, kube-system/vsphere-csi-node-nq4xd
node/wph-wld-rp01-md-0-64fc56fb95-mvbj5 drained
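If the node had been running pods with emptyDir volumes or pods without a controller, the additional flags described above would be required. A sketch of such an invocation (not needed in this walkthrough, where --ignore-daemonsets was sufficient):
# Hypothetical drain using the extra flags discussed above
kubectl drain wph-wld-rp01-md-0-64fc56fb95-mvbj5 --ignore-daemonsets --delete-emptydir-data --force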
Make sure scheduling is disabled
kubectl get nodes
NAME                                 STATUS                     ROLES                  AGE   VERSION
wph-wld-rp01-control-plane-c7mxm     Ready                      control-plane,master   17d   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready                      <none>                 12m   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready                      <none>                 17d   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   Ready,SchedulingDisabled   <none>                 12m   v1.21.2+vmware.1
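The cordon status can also be read directly from the node spec, since kubectl drain/cordon sets spec.unschedulable:
# Should print "true" for a cordoned node
kubectl get node wph-wld-rp01-md-0-64fc56fb95-mvbj5 -o jsonpath='{.spec.unschedulable}'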
Delete the node
kubectl delete node wph-wld-rp01-md-0-64fc56fb95-mvbj5
node "wph-wld-rp01-md-0-64fc56fb95-mvbj5" deleted
Observing changes in the management cluster
Running kubectl delete node against the workload cluster triggers deletion of the corresponding Machine object as well.
As seen in the output below, the old Machine object has been deleted and a new machine, wph-wld-rp01-md-0-64fc56fb95-wqt7d, is now being provisioned.
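This replacement behavior relies on the MachineHealthCheck that TKG typically enables for workload clusters: once the Node object disappears, the affected Machine is remediated and the MachineDeployment provisions a replacement. It can be inspected from the management cluster context (switched to below):
# List the MachineHealthCheck objects across namespaces
kubectl get machinehealthcheck -A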
kubectl config use-context ph-mgmt-rp01-admin@ph-mgmt-rp01
Switched to context "ph-mgmt-rp01-admin@ph-mgmt-rp01".
kubectl get machines
NAME                                 PROVIDERID                                       PHASE          VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d                                                    Provisioning   v1.21.2+vmware.1
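Instead of polling, the rollout can be followed continuously with the standard watch flag:
# Watch Machine objects until the new worker reaches the Running phase (Ctrl+C to stop)
kubectl get machines -w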
# Provisioning Complete
kubectl get machines
NAME                                 PROVIDERID                                       PHASE     VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d   vsphere://423cf20b-d028-993d-f452-f9c303e98ce5   Running   v1.21.2+vmware.1
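As a further check from the management cluster, the MachineDeployment that owns the md-0 worker pool should report all replicas ready. The object name below is inferred from the machine name prefix and may differ in other environments:
# Assumed MachineDeployment name, derived from the worker machine names
kubectl get machinedeployment wph-wld-rp01-md-0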
Verify new worker node is added
The new node wph-wld-rp01-md-0-64fc56fb95-wqt7d has joined the cluster with the same name as the newly provisioned machine shown in the output above.
kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
kubectl get nodes
NAME                                 STATUS   ROLES                  AGE     VERSION
wph-wld-rp01-control-plane-c7mxm     Ready    control-plane,master   17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready    <none>                 19m     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready    <none>                 17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d   Ready    <none>                 4m53s   v1.21.2+vmware.1
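To correlate the new node with the Machine object seen from the management cluster, compare the node's providerID against the PROVIDERID column in the earlier output:
# Prints the vSphere provider ID recorded on the new node; it should match the Machine's PROVIDERID
kubectl get node wph-wld-rp01-md-0-64fc56fb95-wqt7d -o jsonpath='{.spec.providerID}'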
Alternative Approach