How to safely remove a worker node from TKGM clusters
Environment Details
Tanzu Cluster Details
tanzu cluster list
NAME           NAMESPACE   STATUS    CONTROLPLANE   WORKERS   KUBERNETES         ROLES    PLAN
wph-wld-rp01   default     running   1/1            3/3       v1.21.2+vmware.1   <none>   dev
kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
Switched to context "wph-wld-rp01-admin@wph-wld-rp01".
kubectl get nodes
NAME                                 STATUS   ROLES                  AGE     VERSION
wph-wld-rp01-control-plane-c7mxm     Ready    control-plane,master   17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready    <none>                 3m20s   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready    <none>                 17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   Ready    <none>                 3m40s   v1.21.2+vmware.1
# From Management cluster context
kubectl config use-context ph-mgmt-rp01-admin@ph-mgmt-rp01
Switched to context "ph-mgmt-rp01-admin@ph-mgmt-rp01".
kubectl get machines
NAME                                 PROVIDERID                                       PHASE     VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   vsphere://423c4715-64ac-decc-7634-522888597e45   Running   v1.21.2+vmware.1
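The worker Machines above belong to a MachineDeployment (here `wph-wld-rp01-md-0`, as the Machine names suggest). If useful, it can be listed from the same management cluster context; a minimal sketch, assuming the standard Cluster API resources exposed by TKGM management clusters:
# List the MachineDeployment that owns the worker Machines (management cluster context)
kubectl get machinedeployments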
Safely Removing a Worker Node
- Select the worker node for deletion. In this example, the node `wph-wld-rp01-md-0-64fc56fb95-mvbj5` will be removed; the workloads currently running on it can be reviewed after switching context, as shown below.
Switch to workload cluster context
kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
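Before draining, it can be useful to review what is currently running on the node selected for removal. A minimal check using this example's node name (the pods and namespaces in your environment will differ):
# List all pods currently scheduled on the node that will be removed
kubectl get pods --all-namespaces --field-selector spec.nodeName=wph-wld-rp01-md-0-64fc56fb95-mvbj5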
Drain the node
- Drain the node using `kubectl drain`.
- Depending on the workloads running on the node, additional options may be needed (see the example after the drain output below). The two most frequently used options are:
  - `--delete-local-data`: continue even if there are pods using emptyDir volumes (local data that will be deleted when the node is drained).
  - `--force`: continue even if there are pods that do not declare a controller.
- Other drain options are listed in the `kubectl drain` help output.
kubectl drain wph-wld-rp01-md-0-64fc56fb95-mvbj5 --ignore-daemonsets
node/wph-wld-rp01-md-0-64fc56fb95-mvbj5 already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/calico-node-r5sfv, kube-system/kube-proxy-52pdn, kube-system/vsphere-csi-node-nq4xd
node/wph-wld-rp01-md-0-64fc56fb95-mvbj5 drained
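If the node also hosts pods that use emptyDir volumes or pods without a controller, the drain can be extended with the options listed above. A sketch using this example's node name (on newer kubectl releases the `--delete-local-data` flag is named `--delete-emptydir-data`):
# Drain, also evicting pods that use emptyDir volumes and pods not managed by a controller
kubectl drain wph-wld-rp01-md-0-64fc56fb95-mvbj5 --ignore-daemonsets --delete-local-data --force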
Make sure scheduling is disabled
kubectl get nodes
NAME                                 STATUS                     ROLES                  AGE   VERSION
wph-wld-rp01-control-plane-c7mxm     Ready                      control-plane,master   17d   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready                      <none>                 12m   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready                      <none>                 17d   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   Ready,SchedulingDisabled   <none>                 12m   v1.21.2+vmware.1
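To check only the drained node instead of scanning the full node list, its unschedulable flag can be queried directly; a minimal sketch that prints `true` when the node is cordoned:
# Confirm the drained node is marked unschedulable
kubectl get node wph-wld-rp01-md-0-64fc56fb95-mvbj5 -o jsonpath='{.spec.unschedulable}'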
Delete the node
kubectl delete node wph-wld-rp01-md-0-64fc56fb95-mvbj5
node "wph-wld-rp01-md-0-64fc56fb95-mvbj5" deleted
Observing changes in the management cluster
- Running `kubectl delete node` on the workload cluster triggers deletion of the corresponding Machine object in the management cluster as well.
- As seen in the output below, the old Machine object has been deleted and a new Machine, `wph-wld-rp01-md-0-64fc56fb95-wqt7d`, is being provisioned.
kubectl config use-context ph-mgmt-rp01-admin@ph-mgmt-rp01
Switched to context "ph-mgmt-rp01-admin@ph-mgmt-rp01".
kubectl get machines
NAME                                 PROVIDERID                                       PHASE          VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d                                                    Provisioning   v1.21.2+vmware.1
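To follow the new Machine until it reaches the Running phase, the Machine objects can be watched instead of polled repeatedly; a minimal sketch from the management cluster context:
# Watch Machine objects until the replacement worker reports Running
kubectl get machines --watch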
# Provisioning Complete
kubectl get machines
NAME                                 PROVIDERID                                       PHASE     VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d   vsphere://423cf20b-d028-993d-f452-f9c303e98ce5   Running   v1.21.2+vmware.1
Verify the new worker node is added
- The new node `wph-wld-rp01-md-0-64fc56fb95-wqt7d` is added and matches the newly provisioned Machine of the same name from the output above; the provider IDs can be compared to confirm this, as shown after the node listing below.
kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
kubectl get nodes
NAME                                 STATUS   ROLES                  AGE     VERSION
wph-wld-rp01-control-plane-c7mxm     Ready    control-plane,master   17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready    <none>                 19m     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready    <none>                 17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d   Ready    <none>                 4m53s   v1.21.2+vmware.1
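To confirm the new node corresponds to the newly provisioned Machine, its provider ID can be compared with the PROVIDERID column from the management cluster output; a minimal check from the workload cluster context:
# Print the node's provider ID; it should match the new Machine's PROVIDERID in the management cluster
kubectl get node wph-wld-rp01-md-0-64fc56fb95-wqt7d -o jsonpath='{.spec.providerID}'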
Alternative Approach