In Kubernetes, what’s the best way to increase the number of service instances as the number of nodes in the cluster increases? For example, I might run a Deployment with 1 or 2 replicas to start, but if several nodes are added I’d like to scale this up to match the number of nodes in the cluster.
If you want to run with #replicas = #nodes, there’s a simple solution: DaemonSets. A DaemonSet runs one pod on each node, no matter how many nodes there are.
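As a minimal sketch of the DaemonSet approach, the manifest below runs one copy of a pod on every schedulable node. The name `my-service` and the image are placeholders assumed for illustration; they do not come from the question.

```yaml
# Hypothetical example: `my-service` and the image are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-service
  labels:
    app: my-service
spec:
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
      - name: my-service
        image: example.com/my-service:1.0   # placeholder image
```

A DaemonSet has no replicas field; the pod count follows the node count as nodes join or leave. Nodes with taints the pod does not tolerate are skipped.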
I’ll address scaling based on load in a future comment.
Scaling pods based on load is not quite as easy as using a DaemonSet to run one pod on each node, but it is doable. The Horizontal Pod Autoscaler scales the number of pod replicas based on the consumption of a resource, as shown here:
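```yaml
# autoscaling/v2beta2 was removed in Kubernetes 1.26;
# on 1.23 and later, use autoscaling/v2 (same fields for this example).
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  # The workload whose replica count is adjusted.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # percent of requested CPU, averaged over pods
```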
That would autoscale a Deployment named ‘php-apache’ from 1 to 10 replicas, targeting an average CPU utilization of 50%. Be sure to set CPU requests on the containers: utilization is measured as a percentage of the requested CPU, and without a request the autoscaler cannot compute it and will not scale on that metric.
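For a utilization target to be computable, the containers in the target pods need a CPU request, since utilization is measured as a percentage of the requested amount. Here is a sketch of the relevant fragment of the pod template; the values `200m` and `500m` are assumptions chosen for illustration, not recommendations.

```yaml
# Fragment of the Deployment's pod template (spec.template.spec).
containers:
- name: php-apache
  image: php:5-apache
  resources:
    requests:
      cpu: 200m    # assumed value; a 50% target then means about 100m per pod on average
    limits:
      cpu: 500m    # assumed value; caps what a single container may use
```

With these numbers, the autoscaler would add replicas once average usage across the pods rises above roughly 100m of CPU each.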
When designing a deployment to be autoscaled, it can also be worthwhile to set up pod anti-affinity to reduce the chance of multiple copies of the same pod competing for a limited resource on one node while no copies run on another. In general, affinities can be extremely powerful tools. The following php-apache deployment will prefer to schedule pods on nodes that do not already have a php-apache pod running, but will also try to be colocated in the same failure domain as a database pod.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  labels:
    app: php-apache
spec:
  selector:
    matchLabels:
      app: php-apache
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      affinity:
        # Soft preference: land in the same zone as a database pod.
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 5
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - database
              # Deprecated since Kubernetes 1.17 in favor of topology.kubernetes.io/zone.
              topologyKey: failure-domain.beta.kubernetes.io/zone
        # Stronger soft preference: avoid nodes that already run php-apache.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 10
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - php-apache
              topologyKey: kubernetes.io/hostname
      containers:
      - name: php-apache
        image: php:5-apache
```
However powerful it is, this autoscaler has requirements of its own. To function, it needs the Kubernetes metrics server to be installed on your cluster. As of 2.31.1, this is not included in Replicated installations by default, so it will need to be included in your app YAML, preferably running within your app’s namespace. Adding resources to the kube-system namespace is not supported.
Autoscaling is more complicated, but also more powerful. Choose the right tool for the job, and make sure to scale based on the limiting resource!