Configuring Kubernetes HPA on a K8s Cluster

May 14, 2020

By Anushka Arora, the force behind the content that one sees on Devtron, who loves sharing her knowledge with people.

Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization.

This blog explains how to configure HPA on a Kubernetes cluster.

Prerequisites to Configure K8s HPA

Ensure that you have a running Kubernetes Cluster and kubectl, version 1.2 or later.
Deploy Metrics Server in the cluster. It provides metrics through the resource metrics API, which HPA uses to collect metrics. To learn how to deploy Metrics Server, see this GitHub repository: Deploy-Metrics-Server.
If you want to use custom metrics, your cluster must be able to communicate with the API server providing the custom metrics API.
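
The prerequisites above can be checked before going further. The following is a minimal sketch, not an official tool: the function name check_hpa_prereqs is hypothetical, and it assumes that Metrics Server registers the resource metrics API under the APIService name v1beta1.metrics.k8s.io, which is the usual default.

```shell
# Hypothetical helper: checks the HPA prerequisites listed above.
check_hpa_prereqs() {
  # kubectl must be installed and on PATH.
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found on PATH" >&2
    return 1
  fi
  # The cluster must be reachable with the current kubeconfig.
  if ! kubectl cluster-info >/dev/null 2>&1; then
    echo "cannot reach the cluster" >&2
    return 1
  fi
  # Assumption: Metrics Server exposes the resource metrics API under this name.
  if ! kubectl get apiservice v1beta1.metrics.k8s.io >/dev/null 2>&1; then
    echo "resource metrics API is not registered; deploy Metrics Server first" >&2
    return 1
  fi
  echo "prerequisites look OK"
}
```

Call check_hpa_prereqs from a terminal that has access to the cluster. A non-zero exit status means one of the prerequisites is probably missing.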

Below are the steps to deploy an application and configure HPA on a Kubernetes cluster:

Deploy an Application using Docker

Here, we are using a custom Docker image based on the php-apache image.
Create a Dockerfile with the following content:

FROM php:5-apache
ADD index.php /var/www/html/index.php
RUN chmod a+rx index.php

Below is the index.php page, which performs calculations to generate intensive CPU load.

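<?php
  $x = 0.0001;
  // Repeated square roots keep the CPU busy for each request.
  for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
  }
  echo "OK!";
?>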

Then start a deployment running the image and expose it as a service using the following YAML Configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

Then, run the following command:

kubectl apply -f https://k8s.io/examples/application/php-apache.yaml

Create Horizontal Pod Autoscaler

Now that the server is running, create an autoscaler using kubectl autoscale.

When you create a Horizontal Pod Autoscaler, it maintains between 1 and 10 replicas of the Pods controlled by the php-apache Deployment that you created in the previous step. HPA increases or decreases the number of replicas to maintain an average CPU utilization of 50% across all Pods. You can create the Horizontal Pod Autoscaler using the following kubectl autoscale command:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
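
If you prefer to keep the autoscaler in version control, the same settings can be written as a manifest. The sketch below assumes the autoscaling/v1 API, which expresses the CPU target as targetCPUUtilizationPercentage. The file name php-apache-hpa.yaml is only an example.

```shell
# Write a manifest that mirrors the kubectl autoscale command above.
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF

# Apply it from a terminal with cluster access:
# kubectl apply -f php-apache-hpa.yaml
```

Use either the command or the manifest, not both, since each one creates an autoscaler named php-apache.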

You can check the current status of the autoscaler using the command:

kubectl get hpa

How Does HPA React to Increased Load?

Run all of the following commands in a separate terminal.

First, start a busybox container in the cluster:

kubectl run -it --rm load-generator --image=busybox --restart=Never

Then, send an infinite loop of queries to the php-apache service:

while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

You can check the higher CPU load by executing:

kubectl get hpa
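
The replica count you see under load follows roughly from the ratio between observed and target utilization: desired replicas = ceil(current replicas × current utilization / target utilization), kept within the configured minimum and maximum. The sketch below illustrates only that arithmetic. The function name desired_replicas is hypothetical, the input values are illustrative, and the real controller also applies a tolerance and stabilization behaviour that are ignored here.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Utilization values are whole percentages of the requested CPU.
desired_replicas() {
  hpa_current=$1
  hpa_util=$2
  hpa_target=$3
  hpa_min=$4
  hpa_max=$5
  # Integer ceiling of current * util / target.
  hpa_desired=$(( (hpa_current * hpa_util + hpa_target - 1) / hpa_target ))
  # Clamp the result to the configured bounds.
  if [ "$hpa_desired" -lt "$hpa_min" ]; then
    hpa_desired=$hpa_min
  fi
  if [ "$hpa_desired" -gt "$hpa_max" ]; then
    hpa_desired=$hpa_max
  fi
  echo "$hpa_desired"
}

# One replica at 305% of a 50% target, bounded to 1..10:
desired_replicas 1 305 50 1 10   # prints 7
```

With the load removed, for example desired_replicas 7 0 50 1 10, the result is clamped to the minimum of 1.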

Terminate the Load

You can now stop the user load. Switch to the terminal where you started the busybox container and press Ctrl + C.

You can verify that the load has stopped using the command:

kubectl get hpa

CPU utilization will drop to 0%, and HPA will scale the number of replicas back down to 1. Scaling down may take a few minutes.

Result State:

NAME         REFERENCE                     TARGET     MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache/scale   0% / 50%   1         10        1          11

Originally published at https://devtron.ai on May 14, 2020.