Repairing BOSH-Created Persistent Disks of TKGI Worker Nodes
TKGI worker nodes are BOSH-deployed and have at least three disks attached to them (assuming you are not using Kubernetes persistent volumes). These three disks are:
- Default stemcell disk - mounted as the root partition, usually 3 GB in size
- Ephemeral disk - this is where all the logs and BOSH job data get pushed on VM creation. It is mounted at /var/vcap/data
- Persistent disks - these are attached to the VM to store data that needs to be available across VM recreates. They are mounted at /var/vcap/store (a quick way to inspect this layout is sketched below)
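A quick way to inspect this disk layout on a worker VM (a sketch; device names such as /dev/sdb and /dev/sdc vary by IaaS):
lsblk
df -h /var/vcap/data /var/vcap/store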
There are times when this persistent disk can get corrupted due to underlying IaaS or filesystem issues. You can follow the steps below to recover it.
Info
If an ephemeral disk is corrupted, recovery is faster and simpler using bosh recreate or bosh cck.
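For reference, a minimal sketch of that faster path (the deployment and instance names are the ones used in the examples later in this doc):
bosh -d service-instance_77e44aad-1a76-4980-8d4e-43d7c273d167 cck
bosh -d service-instance_77e44aad-1a76-4980-8d4e-43d7c273d167 recreate worker/fcd09dc3-9e7a-4528-8015-22620b553f27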
Identify the node to run fsck
- In this example, we will repair the node with IP 10.20.0.5
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
011704a1-5f0f-4cb9-bd91-f9ad7aec17e5 Ready <none> 20h v1.23.7+vmware.1 10.20.0.5 10.20.0.5 Ubuntu 16.04.7 LTS 4.15.0-191-generic containerd://1.6.4
8334e164-8e9b-4ffb-9c89-bfe015e094a8 Ready <none> 20h v1.23.7+vmware.1 10.20.0.4 10.20.0.4 Ubuntu 16.04.7 LTS 4.15.0-191-generic containerd://1.6.4
c649ec99-bb3a-4049-9c57-1751f6de271e Ready <none> 21h v1.23.7+vmware.1 10.20.0.3 10.20.0.3 Ubuntu 16.04.7 LTS 4.15.0-191-generic containerd://1.6.4
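If it is not obvious which node is affected, filesystem problems may surface as node conditions or events; a sketch of checking, using the node name from the listing above:
kubectl describe node 011704a1-5f0f-4cb9-bd91-f9ad7aec17e5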
Identify the bosh VM corresponding to that node
bosh vms -d service-instance_77e44aad-1a76-4980-8d4e-43d7c273d167 | grep 10.20.0.5
worker/fcd09dc3-9e7a-4528-8015-22620b553f27 running az 10.20.0.5 vm-c2b8073f-949d-4891-b420-36769ecdee60 medium.disk true bosh-vsphere-esxi-ubuntu-xenial-go_agent/621.265
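If you also want the persistent disk CID that BOSH has attached to this worker (useful when investigating the disk on the IaaS side), a sketch using the same deployment:
bosh -d service-instance_77e44aad-1a76-4980-8d4e-43d7c273d167 instances --details | grep fcd09dc3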
Drain the node
# Other drain options may be needed if the drain fails (see the sketch after this step)
kubectl drain 011704a1-5f0f-4cb9-bd91-f9ad7aec17e5 --ignore-daemonsets
node/011704a1-5f0f-4cb9-bd91-f9ad7aec17e5 cordoned
WARNING: ignoring DaemonSet-managed Pods: pks-system/fluent-bit-7rg24, pks-system/telegraf-xjsx4
evicting pod kube-system/coredns-67bd78c556-9vwfd
pod/coredns-67bd78c556-9vwfd evicted
node/011704a1-5f0f-4cb9-bd91-f9ad7aec17e5 drained
# Make sure scheduling is disabled
kubectl get nodes
NAME STATUS ROLES AGE VERSION
011704a1-5f0f-4cb9-bd91-f9ad7aec17e5 Ready,SchedulingDisabled <none> 20h v1.23.7+vmware.1
8334e164-8e9b-4ffb-9c89-bfe015e094a8 Ready <none> 20h v1.23.7+vmware.1
c649ec99-bb3a-4049-9c57-1751f6de271e Ready <none> 21h v1.23.7+vmware.1
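If the default drain fails because of pods using emptyDir volumes or pods without a controller, a more forceful variant can be used (a sketch; these flags can discard local pod data, so use them deliberately):
kubectl drain 011704a1-5f0f-4cb9-bd91-f9ad7aec17e5 --ignore-daemonsets --delete-emptydir-data --force --grace-period=60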
monit stop all
# Turn off cck/resurrection
bosh update-resurrection off -d service-instance_77e44aad-1a76-4980-8d4e-43d7c273d167
bosh -d service-instance_77e44aad-1a76-4980-8d4e-43d7c273d167 ssh worker/fcd09dc3-9e7a-4528-8015-22620b553f27
# Steps on worker node
sudo su -
monit stop all
# To confirm everything stopped
monit summary
Identify the mount point
- BOSH persistent disks are mounted at /var/vcap/store. Before repairing, we must identify the underlying filesystem device
- From the output below, /var/vcap/store is backed by /dev/sdc1
df -h
Filesystem Size Used Avail Use% Mounted on
<------ Truncated Output ------>
/dev/sda1 2.9G 1.4G 1.4G 52% /
/dev/sdb1 32G 3.5G 27G 12% /var/vcap/data
tmpfs 16M 4.0K 16M 1% /var/vcap/data/sys/run
/dev/sdc1 50G 2.1G 45G 5% /var/vcap/store
<------ Truncated Output ------>
Unmount
- Before running the repair, the directory needs to be unmounted: umount /var/vcap/store
- If umount fails because the device is busy, identify which processes are blocking the operation using fuser -m -u -v /dev/sdc1 or fuser -m -u -v /var/vcap/store
- These services will need to be stopped, and any processes still accessing the mount will need to be terminated (see the sketch below)
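A minimal sketch of that cleanup, assuming the monit-managed jobs are already stopped and a stray process (for example, a shell sitting in /var/vcap/store) is still holding the mount:
# List the blocking processes and their owners
fuser -m -u -v /dev/sdc1
# Terminate them using the PIDs from the fuser output, then retry the unmount
kill <pid>
umount /var/vcap/store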
Run fsck
fsck /dev/sdc1
fsck from util-linux 2.27.1
e2fsck 1.42.13 (17-May-2015)
/dev/sdc1: clean, 12599/3276800 files, 794069/13106688 blocks
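The clean result above means e2fsck skipped a full pass. If you still suspect corruption, you can force a complete check (a sketch; -y answers repair prompts automatically, so make sure that is acceptable before running it):
fsck -f -y /dev/sdc1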
Remount Disk
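Remount the device identified earlier (a sketch, assuming the same /dev/sdc1 device and mount point from the df output above):
mount /dev/sdc1 /var/vcap/store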
- You can confirm the mount is successful using df -h
Start all the processes
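A sketch of bringing the jobs back and re-enabling resurrection, mirroring the earlier stop steps:
# On the worker node
monit start all
# Confirm everything is running again
monit summary
# From where you run the BOSH CLI, turn resurrection back on
bosh update-resurrection on -d service-instance_77e44aad-1a76-4980-8d4e-43d7c273d167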
- As part of stopping and starting the processes, kubelet is also restarted, which should bring the node out of the SchedulingDisabled state
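If the node remains in SchedulingDisabled after the restart, it can be uncordoned manually (a sketch, using the node name from the earlier listing):
kubectl uncordon 011704a1-5f0f-4cb9-bd91-f9ad7aec17e5
kubectl get nodes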