Autoscaling in Kubernetes?


#1

In Kubernetes, what’s the best way to increase the number of service instances as the number of nodes in the cluster increases? For example, I might run a Deployment with 1 or 2 replicas to start, but if several nodes are added I’d like to scale this up to match the number of nodes in the cluster.


#2

If you want to run with #replicas=#nodes, there’s a simple solution: a DaemonSet. This will run one pod on each node, no matter how many nodes there are, skipping only nodes with taints the pod does not tolerate.
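
As a rough sketch of what that could look like, here is a minimal DaemonSet manifest. The name my-service and the image example/my-service:1.0 are placeholders I made up for illustration, so substitute your own.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-service          # hypothetical name
  labels:
    app: my-service
spec:
  selector:
    matchLabels:
      app: my-service       # must match the pod template labels below
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
      - name: my-service
        image: example/my-service:1.0   # hypothetical image

Note that there is no replicas field: the pod count follows the node count automatically as nodes join or leave the cluster.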

I’ll address scaling based on load in a future comment.


#3

Scaling pods based on load is not quite as easy as using a daemonset to run one pod on each node, but is doable. The Horizontal Pod Autoscaler allows scaling the number of pods based on the consumption of a resource, as shown here:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

That would autoscale a deployment named ‘php-apache’ from 1 to 10 replicas, targeting an average CPU utilization of 50%. Be sure to set CPU requests on the containers: utilization is measured as a percentage of each pod’s requested CPU, and the autoscaler cannot compute it for pods that have no request set.
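
For reference, a sketch of how CPU requests might be declared on the container so the autoscaler has something to measure against. This is only a fragment of the deployment's pod template, and the 200m and 500m values are illustrative assumptions rather than recommendations.

# Fragment of spec.template.spec in the php-apache deployment.
containers:
- name: php-apache
  image: php:5-apache
  resources:
    requests:
      cpu: 200m   # assumed value; the 50% target is measured against this
    limits:
      cpu: 500m   # assumed value; optional for the autoscaler itself

With a 200m request and a 50% target, the autoscaler would aim for roughly 100m of average CPU usage per pod, adding replicas when the average runs above that.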

When designing a deployment to be autoscaled, it can also be worthwhile to set up pod anti-affinity to reduce the chance of multiple copies of the same pod competing for a limited resource on one node while no copies run on another. In general, affinities can be extremely powerful tools. The following php-apache deployment will prefer to schedule pods on nodes that do not already have a php-apache pod running, but will also try to land in the same failure domain as a database pod.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  labels:
    app: php-apache
spec:
  selector:
    matchLabels:
      app: php-apache
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      affinity:
        podAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 5
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - database
              topologyKey: failure-domain.beta.kubernetes.io/zone
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 10
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - php-apache
              topologyKey: kubernetes.io/hostname
      containers:
      - name: php-apache
        image: php:5-apache

However powerful it is, this autoscaler has requirements of its own. To function, it needs the Kubernetes metrics server installed on your cluster. As of 2.31.1, this is not included in Replicated installations by default and will need to be included in your app YAML, preferably running within your app’s namespace. Adding resources to the kube-system namespace is not supported.

Autoscaling is more complicated than a daemonset, but also more powerful. Choose the right tool for the job, and make sure to scale based on the limiting resource!