Scaling Tanzu Kubernetes Grid control plane and worker nodes vertically
Horizontal scaling of a TKG cluster is a straightforward task. Vertical scaling, at present, is a more manual and involved process. The TKG documentation outlines the high-level steps, which rely on instructions from the Cluster API documentation. This post walks through the procedure with examples.
At a high level, the process is:
- Save a copy of an existing infrastructure template, a VSphereMachineTemplate in this example
- Create and deploy a new VSphereMachineTemplate object with the updated configuration and a new name
- Update the MachineDeployment to vertically scale the worker nodes, and update the KubeadmControlPlane to vertically scale the control plane nodes
Save the existing VSphereMachineTemplate for the control plane and worker nodes
Switch the context to the management cluster. The commands below assume that the workload cluster to be scaled vertically is in the default namespace. If your cluster is in a different namespace, add -n <namespace> to the commands to target it.
kubectl config use-context oom-mgmt-rp01-admin@oom-mgmt-rp01
Switched to context "oom-mgmt-rp01-admin@oom-mgmt-rp01".
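If you are unsure which namespace your workload cluster lives in, listing the Cluster objects across all namespaces on the management cluster will show it:
kubectl get clusters -A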
Check the available templates in your environment and export them to files.
kubectl get vspheremachinetemplates.infrastructure.cluster.x-k8s.io
NAME                    AGE
oom-wld-control-plane   16h
oom-wld-worker          16h
kubectl get vspheremachinetemplates.infrastructure.cluster.x-k8s.io oom-wld-control-plane -o yaml > oom-wld-control-plane-new.yaml
kubectl get vspheremachinetemplates.infrastructure.cluster.x-k8s.io oom-wld-worker -o yaml > oom-wld-worker-new.yaml
The control plane and worker node VSphereMachineTemplates should look similar to the output below.
# Control Plane
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"VSphereMachineTemplate","metadata":{"annotations":{},"name":"oom-wld-control-plane","namespace":"default"},"spec":{"template":{"spec":{"cloneMode":"fullClone","datacenter":"/Datacenter","datastore":"/Datacenter/datastore/vsanDatastore","diskGiB":20,"folder":"/Datacenter/vm/env06","memoryMiB":4096,"network":{"devices":[{"dhcp4":true,"networkName":"/Datacenter/network/VM Network"}]},"numCPUs":4,"resourcePool":"/Datacenter/host/Cluster/Resources/RP06","server":"10.186.198.51","storagePolicyName":"","template":"/Datacenter/vm/tkg/ubuntu-2004-kube-v1-21-2+vmware-1-tkg-1-7832907791984498322"}}}}
  creationTimestamp: "2021-09-14T15:35:25Z"
  generation: 1
  name: oom-wld-control-plane
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    name: oom-wld
    uid: 80ec55e2-d809-4f80-864b-4ca02315bd23
  resourceVersion: "731343"
  uid: f7b9b894-be5b-4922-a7b1-51738e626709
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /Datacenter
      datastore: /Datacenter/datastore/vsanDatastore
      diskGiB: 20
      folder: /Datacenter/vm/env06
      memoryMiB: 4096
      network:
        devices:
        - dhcp4: true
          networkName: /Datacenter/network/VM Network
      numCPUs: 4
      resourcePool: /Datacenter/host/Cluster/Resources/RP06
      server: 10.186.198.51
      storagePolicyName: ""
      template: /Datacenter/vm/tkg/ubuntu-2004-kube-v1-21-2+vmware-1-tkg-1-7832907791984498322
# Worker Nodes
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"VSphereMachineTemplate","metadata":{"annotations":{},"name":"oom-wld-worker","namespace":"default"},"spec":{"template":{"spec":{"cloneMode":"fullClone","datacenter":"/Datacenter","datastore":"/Datacenter/datastore/vsanDatastore","diskGiB":20,"folder":"/Datacenter/vm/env06","memoryMiB":8192,"network":{"devices":[{"dhcp4":true,"networkName":"/Datacenter/network/VM Network"}]},"numCPUs":4,"resourcePool":"/Datacenter/host/Cluster/Resources/RP06","server":"10.186.198.51","storagePolicyName":"","template":"/Datacenter/vm/tkg/ubuntu-2004-kube-v1-21-2+vmware-1-tkg-1-7832907791984498322"}}}}
  creationTimestamp: "2021-09-14T15:35:25Z"
  generation: 1
  name: oom-wld-worker
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    name: oom-wld
    uid: 80ec55e2-d809-4f80-864b-4ca02315bd23
  resourceVersion: "731305"
  uid: 9e23416a-9412-4fc1-b192-0db0893f14d8
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /Datacenter
      datastore: /Datacenter/datastore/vsanDatastore
      diskGiB: 20
      folder: /Datacenter/vm/env06
      memoryMiB: 8192
      network:
        devices:
        - dhcp4: true
          networkName: /Datacenter/network/VM Network
      numCPUs: 4
      resourcePool: /Datacenter/host/Cluster/Resources/RP06
      server: 10.186.198.51
      storagePolicyName: ""
      template: /Datacenter/vm/tkg/ubuntu-2004-kube-v1-21-2+vmware-1-tkg-1-7832907791984498322
Remove the server-generated object metadata fields (annotations, creationTimestamp, generation, ownerReferences, resourceVersion, and uid) from the YAML files above. In this scenario we want to bump the CPU count from 4 to 8 for both the control plane and worker nodes. Two fields need to be updated to achieve this: metadata.name and spec.template.spec.numCPUs. These fields have been updated in the YAML files (oom-wld-control-plane-new.yaml and oom-wld-worker-new.yaml) as shown below.
# Control Plane
# cat oom-wld-control-plane-new.yaml
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: oom-wld-control-plane-8cpu
  namespace: default
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /Datacenter
      datastore: /Datacenter/datastore/vsanDatastore
      diskGiB: 20
      folder: /Datacenter/vm/env06
      memoryMiB: 4096
      network:
        devices:
        - dhcp4: true
          networkName: /Datacenter/network/VM Network
      numCPUs: 8
      resourcePool: /Datacenter/host/Cluster/Resources/RP06
      server: 10.186.198.51
      storagePolicyName: ""
      template: /Datacenter/vm/tkg/ubuntu-2004-kube-v1-21-2+vmware-1-tkg-1-7832907791984498322
# Worker node
# cat oom-wld-worker-new.yaml
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: oom-wld-worker-8cpu
  namespace: default
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /Datacenter
      datastore: /Datacenter/datastore/vsanDatastore
      diskGiB: 20
      folder: /Datacenter/vm/env06
      memoryMiB: 8192
      network:
        devices:
        - dhcp4: true
          networkName: /Datacenter/network/VM Network
      numCPUs: 8
      resourcePool: /Datacenter/host/Cluster/Resources/RP06
      server: 10.186.198.51
      storagePolicyName: ""
      template: /Datacenter/vm/tkg/ubuntu-2004-kube-v1-21-2+vmware-1-tkg-1-7832907791984498322
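If you prefer to script these edits rather than make them by hand, the following is a minimal sketch, assuming the mikefarah yq v4 CLI is available, that strips the server-generated metadata, renames the template, and sets the CPU count in place; run the equivalent against oom-wld-worker-new.yaml with the worker name:
# Strip server-generated metadata, rename the template, and set 8 CPUs (assumes yq v4)
yq -i '
  del(.metadata.annotations) |
  del(.metadata.creationTimestamp) |
  del(.metadata.generation) |
  del(.metadata.ownerReferences) |
  del(.metadata.resourceVersion) |
  del(.metadata.uid) |
  .metadata.name = "oom-wld-control-plane-8cpu" |
  .spec.template.spec.numCPUs = 8
' oom-wld-control-plane-new.yaml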
Deploy the new VSphereMachineTemplate objects
The next step is to create these templates in the management cluster.
kubectl apply -f oom-wld-control-plane-new.yaml
vspheremachinetemplate.infrastructure.cluster.x-k8s.io/oom-wld-control-plane-8cpu created
kubectl apply -f oom-wld-worker-new.yaml
vspheremachinetemplate.infrastructure.cluster.x-k8s.io/oom-wld-worker-8cpu created
View the newly created templates.
kubectl get vspheremachinetemplates.infrastructure.cluster.x-k8s.io
NAME                         AGE
oom-wld-control-plane        16h
oom-wld-control-plane-8cpu   12s
oom-wld-worker               16h
oom-wld-worker-8cpu          13s
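As a quick sanity check, you can confirm the new template carries the updated CPU count before pointing anything at it:
kubectl get vspheremachinetemplates.infrastructure.cluster.x-k8s.io oom-wld-control-plane-8cpu -o jsonpath='{.spec.template.spec.numCPUs}{"\n"}'
8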
At this point the new templates are ready to be used, but the actual vertical scaling has not kicked in yet.
Scaling TKG control plane and worker nodes vertically
Scaling control plane nodes
To scale the control plane nodes, edit the KubeadmControlPlane object and point its spec.infrastructureTemplate.name field at the new template.
kubectl get kubeadmcontrolplanes.controlplane.cluster.x-k8s.io oom-wld-control-plane -o jsonpath='{.spec.infrastructureTemplate.name}{"\n"}'
oom-wld-control-plane
kubectl edit kubeadmcontrolplanes.controlplane.cluster.x-k8s.io oom-wld-control-plane
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/oom-wld-control-plane edited
kubectl get kubeadmcontrolplanes.controlplane.cluster.x-k8s.io oom-wld-control-plane -o jsonpath='{.spec.infrastructureTemplate.name}{"\n"}'
oom-wld-control-plane-8cpu
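kubectl edit opens an interactive editor; if you want a non-interactive alternative, a merge patch along these lines should achieve the same change (a sketch; adjust the names to your environment):
kubectl patch kubeadmcontrolplanes.controlplane.cluster.x-k8s.io oom-wld-control-plane \
  --type merge -p '{"spec":{"infrastructureTemplate":{"name":"oom-wld-control-plane-8cpu"}}}'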
The change above triggers a rolling update of the control plane: Cluster API provisions a replacement node from the new template before removing the old one.
kubectl get machine
NAME                            PROVIDERID                                       PHASE          VERSION
oom-wld-control-plane-gm4rw     vsphere://42292fa2-3c30-bee4-467f-e509e320f76b   Running        v1.21.2+vmware.1
oom-wld-control-plane-kpxxm                                                      Provisioning   v1.21.2+vmware.1
oom-wld-md-0-57f7b8f8d8-fn487   vsphere://42293388-c4db-af6b-db38-2f31b6eb3381   Running        v1.21.2+vmware.1
oom-wld-md-0-57f7b8f8d8-lpdnj   vsphere://4229269c-11f0-06b5-9ba1-5fc569fa94b2   Running        v1.21.2+vmware.1
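To follow the rollout to completion, you can watch the Machine objects until the replacement node reports Running and the old control plane node is removed:
kubectl get machines -w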
Scaling worker nodes
To scale the worker nodes, edit the MachineDeployment object and point its spec.template.spec.infrastructureRef.name field at the new template.
kubectl get md oom-wld-md-0 -o jsonpath='{.spec.template.spec.infrastructureRef.name}{"\n"}'
oom-wld-worker
kubectl edit md oom-wld-md-0
machinedeployment.cluster.x-k8s.io/oom-wld-md-0 edited
kubectl get md oom-wld-md-0 -o jsonpath='{.spec.template.spec.infrastructureRef.name}{"\n"}'
oom-wld-worker-8cpu
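As with the control plane, a non-interactive merge patch should work here too (a sketch; adjust the names to your environment):
kubectl patch machinedeployments.cluster.x-k8s.io oom-wld-md-0 \
  --type merge -p '{"spec":{"template":{"spec":{"infrastructureRef":{"name":"oom-wld-worker-8cpu"}}}}}'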
Similar to the behavior we observed during control plane scaling, the change above triggers a rollout of new worker nodes under a new MachineSet.
kubectl get machines
NAME                            PROVIDERID                                       PHASE          VERSION
oom-wld-control-plane-gm4rw     vsphere://42292fa2-3c30-bee4-467f-e509e320f76b   Running        v1.21.2+vmware.1
oom-wld-md-0-57f7b8f8d8-fn487                                                    Provisioning   v1.21.2+vmware.1
oom-wld-md-0-64cf7d8b7-pxb8d    vsphere://4229e551-60fd-c514-083a-d62b9c703253   Running        v1.21.2+vmware.1
oom-wld-md-0-64cf7d8b7-zvszp    vsphere://422925b0-a046-f35c-dbc2-0f148f9c2d04   Running        v1.21.2+vmware.1
Once provisioning finishes, your control plane and worker nodes will be running with the new desired configuration.
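To confirm the new sizing from inside the workload cluster, you can switch to its context and check each node's CPU capacity. The context name below follows the TKG admin-context naming convention and is an assumption; substitute your own:
# Context name assumed from the TKG <cluster>-admin@<cluster> convention
kubectl config use-context oom-wld-admin@oom-wld
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu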