How to rotate certificates of a TKGS cluster without upgrading¶
When Tanzu Kubernetes clusters are created, certificates are generated during the kubeadm
initialization phase. These certificates expire after a year and are rotated automatically when you upgrade your clusters. But what if you cannot upgrade your clusters within that one-year time frame?
This post shares the steps for rotating certificates without upgrading your clusters. At a high level, the process triggers a rollout of the control plane nodes so they go through the kubeadm
initialization phase again and generate new certificates. The steps below walk you through a certificate rotation in a TKGS environment. For a similar walkthrough in a TKGM environment, check out this post
Notes¶
- SSH to one of the Supervisor cluster nodes to run these steps
- These steps have not been tested against clusters where certificates have already expired
- Environment Details
- vCenter - 7.0.3, Build 20395099
- ESXi - 7.0.3, Build 20328353
Retrieve Workload Cluster Information¶
- These environment variables will be used throughout this process (k is aliased to kubectl)
alias k='kubectl'
export CLUSTER_NAMESPACE="tanzu-support"
k get clusters -n $CLUSTER_NAMESPACE
NAME PHASE AGE VERSION
tanzu-support-cluster Provisioned 43h
# Get workload cluster kubeconfig
export CLUSTER_NAME="tanzu-support-cluster"
k get secrets -n $CLUSTER_NAMESPACE $CLUSTER_NAME-kubeconfig -o jsonpath='{.data.value}' | base64 -d > $CLUSTER_NAME-kubeconfig
# Get workload cluster ssh key
k get secrets -n $CLUSTER_NAMESPACE $CLUSTER_NAME-ssh -o jsonpath='{.data.ssh-privatekey}' | base64 -d > $CLUSTER_NAME-ssh-privatekey
chmod 600 $CLUSTER_NAME-ssh-privatekey
Environment before certificate rotation¶
export KUBECONFIG=$CLUSTER_NAME-kubeconfig
kubectl get nodes \
-o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' \
-l node-role.kubernetes.io/master= > nodes
for i in `cat nodes`; do
printf "\n######\n"
ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i hostname
ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i sudo kubeadm certs check-expiration
done;
Sample Output¶
######
tanzu-support-cluster-control-plane-k8bqh
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Oct 04, 2023 23:00 UTC 363d no
apiserver Oct 04, 2023 23:00 UTC 363d ca no
apiserver-etcd-client Oct 04, 2023 23:00 UTC 363d etcd-ca no
apiserver-kubelet-client Oct 04, 2023 23:00 UTC 363d ca no
controller-manager.conf Oct 04, 2023 23:00 UTC 363d no
etcd-healthcheck-client Oct 04, 2023 23:00 UTC 363d etcd-ca no
etcd-peer Oct 04, 2023 23:00 UTC 363d etcd-ca no
etcd-server Oct 04, 2023 23:00 UTC 363d etcd-ca no
front-proxy-client Oct 04, 2023 23:00 UTC 363d front-proxy-ca no
scheduler.conf Oct 04, 2023 23:00 UTC 363d no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Oct 01, 2032 22:56 UTC 9y no
etcd-ca Oct 01, 2032 22:56 UTC 9y no
front-proxy-ca Oct 01, 2032 22:56 UTC 9y no
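If you want to turn an EXPIRES value from the output above into a day count yourself, a small helper does it. This is a minimal sketch, assuming GNU date (as found on the Photon OS nodes); days_left is a hypothetical name, not part of kubeadm:

```shell
# Minimal sketch (GNU date assumed): compute days remaining until an
# EXPIRES value as printed by `kubeadm certs check-expiration`.
days_left() {
  # parse e.g. "Oct 04, 2023 23:00 UTC" into a Unix epoch
  expiry_epoch=$(date -u -d "$1" +%s)
  now_epoch=$(date -u +%s)
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

days_left "Oct 01, 2032 22:56 UTC"   # residual time of the ca certificate, in days
```
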
Rotating Certificates¶
- Make sure the kube-context is switched back to the Supervisor cluster before running the steps below
unset KUBECONFIG
k config current-context
kubernetes-admin@kubernetes
k get kcp -n $CLUSTER_NAMESPACE $CLUSTER_NAME-control-plane -o jsonpath='{.apiVersion}{"\n"}'
controlplane.cluster.x-k8s.io/v1beta1
k get kcp -n $CLUSTER_NAMESPACE $CLUSTER_NAME-control-plane
NAME CLUSTER INITIALIZED API SERVER AVAILABLE REPLICAS READY UPDATED UNAVAILABLE AGE VERSION
tanzu-support-cluster-control-plane tanzu-support-cluster true true 3 3 3 0 43h v1.20.12+vmware.1
kubectl patch kcp $CLUSTER_NAME-control-plane -n $CLUSTER_NAMESPACE --type merge -p "{\"spec\":{\"rolloutAfter\":\"`date -u +'%Y-%m-%dT%TZ'`\"}}"
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/tanzu-support-cluster-control-plane patched
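The embedded date call above builds an RFC 3339 timestamp for the rolloutAfter field. If the nested quoting is hard to follow, the payload can be built in two steps first (TS and PATCH are illustrative variable names):

```shell
# Build the merge-patch payload separately before applying it.
# Using date -u keeps the timestamp in UTC, matching the trailing "Z".
TS=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
PATCH="{\"spec\":{\"rolloutAfter\":\"$TS\"}}"
echo "$PATCH"
# then:
# kubectl patch kcp $CLUSTER_NAME-control-plane -n $CLUSTER_NAMESPACE --type merge -p "$PATCH"
```
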
# Machine rollout started
k get machines -n $CLUSTER_NAMESPACE
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION
tanzu-support-cluster-control-plane-k8bqh tanzu-support-cluster tanzu-support-cluster-control-plane-k8bqh vsphere://420a2e04-cf75-9b43-f5b6-23ec4df612eb Running 43h v1.20.12+vmware.1
tanzu-support-cluster-control-plane-l7hwd tanzu-support-cluster tanzu-support-cluster-control-plane-l7hwd vsphere://420a57cd-a1a0-fec6-a741-19909854feb6 Running 43h v1.20.12+vmware.1
tanzu-support-cluster-control-plane-mm6xj tanzu-support-cluster tanzu-support-cluster-control-plane-mm6xj vsphere://420a67c2-ce1c-aacc-4f4c-0564daad4efa Running 43h v1.20.12+vmware.1
tanzu-support-cluster-control-plane-nqdv6 tanzu-support-cluster Provisioning 25s v1.20.12+vmware.1
tanzu-support-cluster-workers-v8575-59c6645b4-wvnlz tanzu-support-cluster tanzu-support-cluster-workers-v8575-59c6645b4-wvnlz vsphere://420aa071-9ac2-02ea-6530-eb59ceabf87b Running 43h v1.20.12+vmware.1
# Machine Rollout Complete
k get machines -n $CLUSTER_NAMESPACE
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION
tanzu-support-cluster-control-plane-m9745 tanzu-support-cluster tanzu-support-cluster-control-plane-m9745 vsphere://420a5758-50c4-3172-7caf-0bbacaf882d3 Running 17m v1.20.12+vmware.1
tanzu-support-cluster-control-plane-nqdv6 tanzu-support-cluster tanzu-support-cluster-control-plane-nqdv6 vsphere://420ad908-00c2-4b9b-74d8-8d197442e767 Running 22m v1.20.12+vmware.1
tanzu-support-cluster-control-plane-wdmph tanzu-support-cluster tanzu-support-cluster-control-plane-wdmph vsphere://420af38a-f9f8-cb21-e05d-c1bcb6840a93 Running 10m v1.20.12+vmware.1
tanzu-support-cluster-workers-v8575-59c6645b4-wvnlz tanzu-support-cluster tanzu-support-cluster-workers-v8575-59c6645b4-wvnlz vsphere://420aa071-9ac2-02ea-6530-eb59ceabf87b Running 43h v1.20.12+vmware.1
Verify certificate rotation¶
export KUBECONFIG=$CLUSTER_NAME-kubeconfig
kubectl get nodes \
-o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' \
-l node-role.kubernetes.io/master= > nodes
for i in `cat nodes`; do
printf "\n######\n"
ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i hostname
ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i sudo kubeadm certs check-expiration
done;
######
tanzu-support-cluster-control-plane-m9745
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Oct 06, 2023 18:18 UTC 364d no
apiserver Oct 06, 2023 18:18 UTC 364d ca no
apiserver-etcd-client Oct 06, 2023 18:18 UTC 364d etcd-ca no
apiserver-kubelet-client Oct 06, 2023 18:18 UTC 364d ca no
controller-manager.conf Oct 06, 2023 18:18 UTC 364d no
etcd-healthcheck-client Oct 06, 2023 18:18 UTC 364d etcd-ca no
etcd-peer Oct 06, 2023 18:18 UTC 364d etcd-ca no
etcd-server Oct 06, 2023 18:18 UTC 364d etcd-ca no
front-proxy-client Oct 06, 2023 18:18 UTC 364d front-proxy-ca no
scheduler.conf Oct 06, 2023 18:18 UTC 364d no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Oct 01, 2032 22:56 UTC 9y no
etcd-ca Oct 01, 2032 22:56 UTC 9y no
front-proxy-ca Oct 01, 2032 22:56 UTC 9y no
Certificate residual time reset to 364 days¶
kubelet certificate¶
- Rotation is not needed, as rotateCertificates is set to true in the kubelet config
- This configuration can be verified using the commands below
kubectl get nodes \
-o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' \
-l node-role.kubernetes.io/master!= > workernodes
for i in `cat workernodes`; do
printf "\n######\n"
ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i hostname
ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i sudo grep rotate /var/lib/kubelet/config.yaml
done;
######
tanzu-support-cluster-workers-v8575-59c6645b4-wvnlz
rotateCertificates: true
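For context, the rotateCertificates line grepped above comes from a kubelet configuration stanza that looks roughly like this (illustrative excerpt, not the full file):

```yaml
# Illustrative excerpt of /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
rotateCertificates: true
```

With this set, the kubelet automatically requests a new client certificate from the API server as the current one approaches expiry, so no manual action is required on worker nodes.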