Does Replicated + Embedded Kubernetes Appliance self-heal the cluster?

dex · July 31, 2019, 11:21pm

What sort of self-healing / auto-scaling is available to clusters deployed on-premise?

dex · July 31, 2019, 11:21pm

The Ceph cluster will automatically rebalance storage in the case of node failure. It will enable a data replication factor computed as min(3, cluster_size), so data is replicated at most three times and never more times than there are nodes.

When worker nodes are lost, Kubernetes will rebalance workloads. As long as sufficient compute resources are available, the cluster can tolerate the loss of any number of worker nodes without downtime. If so many worker nodes are lost that some pods become unschedulable, those pods will be scheduled once the cluster operator boots additional nodes.
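To get a rough feel for "sufficient compute resources", here is a minimal sketch that checks whether a set of pods still fits on the surviving workers. It is a simplified first-fit model over CPU requests only, not the real Kubernetes scheduler, which also weighs memory, affinity, taints and more. The function name and the numbers are hypothetical.

```python
def unschedulable_pods(node_cpu, pod_cpu):
    """Place pods first-fit (largest request first) and return the requests that did not fit.

    node_cpu: free CPU per surviving worker node (any consistent unit).
    pod_cpu:  CPU request per pod, in the same unit.
    Simplified model for illustration only; not the Kubernetes scheduler.
    """
    free = list(node_cpu)
    pending = []
    for request in sorted(pod_cpu, reverse=True):
        for i, capacity in enumerate(free):
            if request <= capacity:
                free[i] -= request  # pod lands on the first node with room
                break
        else:
            pending.append(request)  # no node had room; pod stays pending
    return pending


pods = [2, 2, 2, 1, 1]                      # hypothetical CPU requests
print(unschedulable_pods([4, 4, 4], pods))  # all three workers healthy
print(unschedulable_pods([4], pods))        # two of three workers lost
```

In this example the pods still fit after losing one 4-CPU worker, but not after losing two; the pods left over would stay pending until the operator adds nodes.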

A 3-master cluster can tolerate the loss of a single master node without failure. A 5-master cluster can tolerate the loss of 2 master nodes without failure.
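These numbers follow from majority quorum: the control plane's etcd store stays available only while more than half of its members are up. A minimal sketch of the arithmetic, assuming one etcd member per master node (the function name is hypothetical):

```python
def tolerated_master_failures(masters: int) -> int:
    """Return how many master nodes can be lost while a majority quorum survives."""
    if masters < 1:
        raise ValueError("a cluster needs at least one master node")
    quorum = masters // 2 + 1   # smallest majority of the members
    return masters - quorum     # members that can fail without losing quorum


for n in (1, 3, 4, 5):
    print(n, "masters tolerate", tolerated_master_failures(n), "failure(s)")
```

Note that under this model an even master count adds no tolerance over the odd count below it (4 masters tolerate 1 failure, the same as 3), which is why odd sizes are the usual choice.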

At the moment, Replicated doesn’t communicate with any hypervisor to auto-replace lost nodes. Lost nodes must be replaced by the end customer's IT admin, or by an automated system that they configure. Because new nodes can be joined on boot via optional mounted config files, this process can be automated fairly easily.
