What do I have to do to reboot a Kubernetes Replicated host?


#1

Right now I’m testing Replicated on an AWS EC2 instance, and issuing a reboot causes the VM to hang indefinitely. I think the cause is some Ceph/Rook volumes that cannot be unmounted, but it’s hard to see inside the VM once the reboot command has been issued.


#2

The node needs to be drained before reboot. After the successful drain, the node can be rebooted as usual.

Because the kubectl drain command automatically marks the node as unschedulable (the same effect as kubectl cordon), the node needs to be uncordoned once it’s back online.

Drain the node:

kubectl drain <node-name> --ignore-daemonsets --delete-local-data

Uncordon the node:

kubectl uncordon <node-name>

https://github.com/rook/rook/blob/master/Documentation/common-issues.md#node-hangs-after-reboot
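
The drain and uncordon steps above can be wrapped in two small shell functions. This is only a sketch: the function names are made up for illustration, the 10 minute timeout is an arbitrary choice, and it assumes the API server is reachable again when the second function runs (on a single-node install that means waiting for the control plane to come back first).

```shell
#!/bin/sh
# Hypothetical helpers; the names are not part of kubectl or Replicated.

drain_node() {
    # --ignore-daemonsets: DaemonSet-managed pods are skipped, not evicted
    # --delete-local-data: also evict pods that use emptyDir volumes
    #   (newer kubectl releases call this flag --delete-emptydir-data)
    kubectl drain "$1" --ignore-daemonsets --delete-local-data
}

uncordon_when_ready() {
    # Wait until the node reports the Ready condition, then make it
    # schedulable again. Uncordon is skipped if the wait times out.
    kubectl wait --for=condition=Ready "node/$1" --timeout=10m &&
        kubectl uncordon "$1"
}

# Example usage (run the reboot itself on the node in between):
#   drain_node my-node
#   ...reboot...
#   uncordon_when_ready my-node
```

Keeping the reboot out of the script is deliberate: the drain usually runs from wherever kubectl is configured, while the reboot has to happen on the node itself.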


#3

Some systems may still hang on reboot even after a kubectl drain. To avoid this, stop all pods that use a Rook-provisioned PVC before draining:

replicatedctl app stop
kubectl scale deployment replicated replicated-premkit retraced-postgres --replicas=0

Wait for those pods to terminate before running the drain command.
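
One way to wait is to poll the pod list until nothing matching the scaled-down deployments is left. The sketch below is an assumption-heavy example: the function name and POLL_INTERVAL variable are invented, it relies on pods created by a deployment having names that start with the deployment name, and it only looks at the current namespace.

```shell
#!/bin/sh
# Hypothetical helper; not part of kubectl or Replicated.
# $1: name prefixes joined with | (an extended regex alternation)
# $2: maximum number of polls (default 60)
wait_for_pods_gone() {
    pattern=$1
    tries=${2:-60}
    while [ "$tries" -gt 0 ]; do
        # Caveat: a failed kubectl call prints nothing to stdout and
        # therefore also looks like "no matching pods".
        if ! kubectl get pods --no-headers 2>/dev/null |
            grep -Eq "^($pattern)-"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-5}"
    done
    return 1   # gave up: matching pods are still listed
}

# Example usage:
#   wait_for_pods_gone 'replicated|retraced-postgres' && echo "safe to drain"
```

Note that the prefix replicated also matches the replicated-premkit pods, so both are covered by one alternative. It would also match any other pod whose name starts with replicated-, which is why the loop gives up after a fixed number of polls instead of waiting forever.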

After the reboot, scale the deployments back up:

kubectl scale deployment replicated replicated-premkit retraced-postgres --replicas=1
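
To confirm that the deployments actually came back, the scale-up can be followed by kubectl rollout status. A minimal sketch, where the function name and the 5 minute timeout are assumptions rather than anything Replicated prescribes:

```shell
#!/bin/sh
# Hypothetical helper; not part of kubectl or Replicated.
restore_deployments() {
    # Scale every named deployment back to one replica first...
    for d in "$@"; do
        kubectl scale deployment "$d" --replicas=1 || return 1
    done
    # ...then block until each rollout has finished or timed out.
    for d in "$@"; do
        kubectl rollout status "deployment/$d" --timeout=5m || return 1
    done
}

# Example usage:
#   restore_deployments replicated replicated-premkit retraced-postgres
```

Scaling everything before waiting lets the pods start in parallel instead of one deployment at a time.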