Removing nodes from Kubernetes clusters


#1

In an HA Kubernetes cluster the REK operator will automatically purge failed nodes that have been unreachable for more than an hour. Purging a node involves the following steps (the last two apply only to masters):

  1. Delete the Deployment resource for the OSD from the rook-ceph namespace
  2. Exec into the Rook operator pod and run the command ceph osd purge <id> --yes-i-really-mean-it
  3. Delete the Node resource
  4. Remove the node from the CephCluster resource named rook-ceph in the rook-ceph namespace unless storage is managed automatically with useAllNodes: true
  5. (Masters only) Connect to the etcd cluster and remove the peer
  6. (Masters only) Remove the apiEndpoint for the node from the kubeadm-config ConfigMap in the kube-system namespace
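The storage-related steps above can be performed with kubectl roughly as follows. This is a sketch, not the operator's exact implementation: the node name, OSD id, and the index in the JSON patch are placeholders you must adapt to your cluster.

```shell
# Placeholders -- substitute the failed node's name and its OSD id
NODE=node-k7d4
OSD_ID=3

# 1. Delete the OSD Deployment from the rook-ceph namespace
kubectl -n rook-ceph delete deployment "rook-ceph-osd-${OSD_ID}"

# 2. Purge the OSD from the Ceph cluster via the operator pod
kubectl -n rook-ceph exec deploy/rook-ceph-operator -- \
    ceph osd purge "${OSD_ID}" --yes-i-really-mean-it

# 3. Delete the Node resource
kubectl delete node "${NODE}"

# 4. Drop the node from the CephCluster spec (skip when useAllNodes: true);
#    the index 0 is a placeholder for the node's position in spec.storage.nodes
kubectl -n rook-ceph patch cephcluster rook-ceph --type json \
    -p '[{"op": "remove", "path": "/spec/storage/nodes/0"}]'
```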

All of these steps can be performed manually if needed. For removing etcd peers, exec into one of the remaining etcd pods in the kube-system namespace. You can use the etcdctl CLI there with the certificates mounted in /etc/kubernetes/pki/etcd:

$ cd /etc/kubernetes/pki/etcd
$ ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=ca.crt --cert=healthcheck-client.crt --key=healthcheck-client.key member list

a1316b56d7099abf, started, node-k7d4, https://10.128.0.124:2380, https://10.128.0.124:2379
ab67f9f870c32907, started, node-wbf1, https://10.128.0.125:2380, https://10.128.0.125:2379
d9228c5ac755a5c6, started, node-hrrm, https://10.128.0.123:2380, https://10.128.0.123:2379

$ ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=ca.crt --cert=healthcheck-client.crt --key=healthcheck-client.key member remove a1316b56d7099abf
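The remaining master-only step is removing the dead node's apiEndpoint from the kubeadm-config ConfigMap. On clusters whose kubeadm version still records a ClusterStatus there (this was dropped in newer kubeadm releases), the simplest manual approach is to edit the ConfigMap directly; the node name and address below are placeholders:

```shell
kubectl -n kube-system edit configmap kubeadm-config
# In the editor, delete the removed node's entry under
# ClusterStatus -> apiEndpoints, e.g.:
#
#   apiEndpoints:
#     node-k7d4:                       # <- remove this whole block
#       advertiseAddress: 10.128.0.124
#       bindPort: 6443
```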