What do I have to do to reboot a Kubernetes Replicated host?

ethanm · February 7, 2019, 4:17pm

Right now I’m testing Replicated on an AWS EC2 instance, and issuing a reboot causes the VM to hang indefinitely because some Ceph/Rook volumes cannot be unmounted. (At least I think this is the problem; it’s not easy to see into the VM once the reboot command has been issued.)

ethanm · February 7, 2019, 4:18pm

The node needs to be drained before reboot. After the successful drain, the node can be rebooted as usual.

Because the kubectl drain command automatically marks the node as unschedulable (the same effect as kubectl cordon), the node needs to be uncordoned once it’s back online.

Drain the node:

kubectl drain <node-name> --ignore-daemonsets --delete-local-data

Uncordon the node:

kubectl uncordon <node-name>

rook/ceph-common-issues.md at master · rook/rook · GitHub
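
The drain and uncordon steps could be wrapped in two small shell functions, roughly as sketched here. The function names and the 300s default timeout are illustrative assumptions, not something from the Rook docs. Note that newer kubectl releases deprecate --delete-local-data in favor of --delete-emptydir-data, so the flag may need adjusting for your version.

```shell
#!/bin/sh
# Sketch only: function names and the default timeout are hypothetical.

drain_node() {
    # Evict pods from the node and mark it unschedulable.
    kubectl drain "$1" --ignore-daemonsets --delete-local-data
}

uncordon_when_ready() {
    # Block until the node reports Ready (or the timeout expires),
    # then allow pods to be scheduled on it again.
    kubectl wait --for=condition=Ready "node/$1" --timeout="${2:-300s}" &&
        kubectl uncordon "$1"
}
```

Run drain_node <node-name> before the reboot and uncordon_when_ready <node-name> once the machine is back up, from wherever kubectl can reach the cluster.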

areed · February 26, 2019, 2:31am

Some systems may hang on reboot even after a kubectl drain. It is recommended to remove all pods that use a Rook-provisioned PVC prior to drain. For single-node installs the drain step is not required after scaling down the deployments.

replicatedctl app stop -a
kubectl scale deployment replicated replicated-premkit retraced-postgres --replicas=0

If using the Rook shared filesystem, also scale down the snapshotter deployment:

kubectl scale deployment replicated-shared-fs-snapshotter --replicas=0

Wait for those pods to terminate before running the drain command.
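
One way to do that waiting, as a rough sketch: poll kubectl get pods until no pod names matching the scaled-down deployments are left. The function name, the poll count, and the delay are assumptions for illustration; the name pattern relies on pods of a deployment being named with the deployment name as a prefix, and on the pods living in the current namespace as in the commands above.

```shell
#!/bin/sh
# Sketch only: wait_for_pods_gone and its defaults are hypothetical.

wait_for_pods_gone() {
    pattern=$1          # extended regex matched against "pod/<name>"
    tries=${2:-60}      # number of polls before giving up
    delay=${3:-5}       # seconds to sleep between polls
    while [ "$tries" -gt 0 ]; do
        # Bail out if kubectl itself fails, so an error is not
        # mistaken for "no pods left".
        pods=$(kubectl get pods -o name) || return 2
        if ! printf '%s\n' "$pods" | grep -Eq "$pattern"; then
            return 0    # no matching pods remain
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1            # matching pods still present after all polls
}
```

For the deployments in this post, something like wait_for_pods_gone '^pod/(replicated|retraced-postgres)-' should cover replicated, replicated-premkit, retraced-postgres, and the snapshotter, since all of their pod names start with one of those prefixes.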

After reboot:

kubectl scale deployment replicated replicated-premkit retraced-postgres --replicas=1
kubectl scale deployment replicated-shared-fs-snapshotter --replicas=1 # if used
replicatedctl app start
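
If the app should only be started once the scaled-up deployments are actually available, a hedged sketch is to wait on each one with kubectl rollout status first. The function name and the 300s timeout are assumptions, and whether replicatedctl app start strictly requires this wait is not confirmed here.

```shell
#!/bin/sh
# Sketch only: start_after_reboot and the timeout are hypothetical.

start_after_reboot() {
    for d in "$@"; do
        # rollout status blocks until the deployment has rolled out
        # or the timeout expires; stop if any deployment fails.
        kubectl rollout status "deployment/$d" --timeout=300s || return 1
    done
    # All listed deployments are available; start the application.
    replicatedctl app start
}
```

Call it after the scale commands, for example start_after_reboot replicated replicated-premkit retraced-postgres.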

areed · August 6, 2020, 4:26pm

The AKA reboot service now handles removing all pods with Rook mounts during shutdown. Because of race conditions during shutdown, this script may not complete. To prevent corruption, always run /opt/replicated/shutdown.sh manually before shutting down a node.
