Vertically scaling your Amazon EKS deployments

Here's how to successfully deploy a Vertical Pod Autoscaler (VPA) into your Amazon EKS cluster.

Jun 11, 2024 • 4 Minute Read

You’ve finally grasped Kubernetes, and you think you have the perfect application deployment set up for your web app. 

Fast-forward one month post launch . . .

The good news: Your brilliant application is famous and you’re getting far more traffic than anticipated. The bad news: The containers in your Pods are crashing because their resource limits are too low. 

You’ve tried horizontally scaling your Pods, but it doesn’t resolve the problem. After meticulous research, you’ve decided to try vertically scaling your Pods. One simple Vertical Pod Autoscaler deployment later, it’s smooth sailing!

What is the Vertical Pod Autoscaler (VPA)?

The Vertical Pod Autoscaler, or VPA for short, “automatically adjusts the CPU and memory reservations for your Pods to help ‘right size’ your applications.”*

In other words, you get to leverage this smart Kubernetes component to handle the heavy lifting when setting appropriate memory and CPU reservations. Instead of adding and removing Pods (Horizontal Pod Autoscaler or HPA), you scale the resource allocation (e.g., CPU, memory) of a Pod based on usage.
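
To make that concrete, here is a minimal sketch of what a VPA object can look like once the VPA components are installed in the cluster (the walk-through later in this guide covers the install). The names `my-app` and `my-app-vpa` and the min/max bounds are hypothetical placeholders, not values from this guide. The snippet only writes the manifest to a local file.

```shell
# Write a hypothetical VPA manifest to a local file (nothing is applied yet).
# Assumptions: a Deployment named "my-app" exists; the bounds are illustrative.
cat > my-app-vpa.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:            # the workload whose Pods the VPA manages
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"  # apply recommendations; "Off" only reports them
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: "1"
          memory: 500Mi
        controlledResources: ["cpu", "memory"]
EOF
```

Once the VPA is running in your cluster, `kubectl apply -f my-app-vpa.yaml` would register this object.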

Key differences between HPA and VPA

How do the two types of scaling differ?

  • Scaling direction: HPA scales the number of Pods (horizontal), while VPA scales the resource allocation of each Pod (vertical).

  • Triggers: HPA typically reacts to CPU utilization or custom metrics, while VPA bases its recommendations on each container's observed resource usage (e.g., CPU, memory) over time.

  • Purpose: HPA is designed to handle changes in demand, while VPA is designed to optimize resource utilization.
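
For contrast, here is a rough sketch of an HPA manifest. Notice that it talks about replica counts and a utilization target, while a VPA spec has no replica fields at all. The names `my-app` and `my-app-hpa`, the replica bounds, and the 70% target are hypothetical values chosen for illustration.

```shell
# Write a hypothetical HPA manifest to a local file (nothing is applied yet).
# Assumptions: a Deployment named "my-app" exists; all numbers are illustrative.
cat > my-app-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:       # the workload whose replica count the HPA changes
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add Pods when average CPU use exceeds 70% of requests
EOF
```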

When to choose one autoscaler over another

Why would you use one scaling mechanism instead of the other? Great question. Let’s talk about when to use VPA over HPA.

  1. Use VPA when you want to optimize resource allocation for a Pod, especially when the workload is variable or unpredictable.
  2. Use VPA when you want to ensure a Pod has the necessary resources to run efficiently without over- or under-provisioning (also known as “right-sizing”).

TL;DR: VPA is useful when you aim to optimize resource allocation for a Pod, while HPA is useful when you need to handle changes in demand by rapidly scaling out or in.

Create and use VPA in your cluster

Let's walk through how to use the VPA in your cluster. We’ll use a condensed version of a simple sample application from the documentation. For this walk-through, we assume you’ve set up your AWS CLI credentials via the configuration files or the environment variables.
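
Before you start, it can help to confirm that the command-line tools used in the steps are installed. The helper below is a small sketch, not part of the original walk-through. `missing_tools` is a hypothetical name, and `openssl` is included on the assumption that the VPA install script needs it to generate certificates.

```shell
# Print the name of each required command-line tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools used in this walk-through (openssl is assumed to be needed by the
# VPA install script). No output means everything was found.
missing_tools aws eksctl kubectl git openssl
```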

Note: If you’re following along outside our Pluralsight cloud playgrounds, please ensure you clean up your resources after you’re done.

Step 1: Create a cluster

Create a simple default Kubernetes cluster in Amazon EKS:

      eksctl create cluster --region us-east-1
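
Step 2: Deploy the Metrics Server

The VPA relies on the Kubernetes Metrics Server for usage data, and the verification command that follows checks for its Deployment. If your cluster doesn't already include it, a typical install looks like the sketch below. The manifest URL is an assumption (the latest release published by the upstream kubernetes-sigs/metrics-server project), and `install_metrics_server` is a hypothetical helper name.

```shell
# Assumed manifest location: the latest upstream Metrics Server release.
METRICS_SERVER_MANIFEST="https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml"

# Install (or update) the Metrics Server in the current kubectl context.
install_metrics_server() {
  kubectl apply -f "$METRICS_SERVER_MANIFEST"
}
```

Run `install_metrics_server` once your kubectl context points at the new cluster.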
    

Wait a minute or so, then verify that the metrics-server Deployment is up and running:

      kubectl get deployment metrics-server -n kube-system
    

Step 3: Begin VPA deployment

Now we can deploy the VPA to our cluster. Change to the directory where you want to clone the autoscaler repository, then run the following commands:

      git clone https://github.com/kubernetes/autoscaler.git
      cd autoscaler/vertical-pod-autoscaler/
      ./hack/vpa-up.sh

    

This installs several required components using a set of predefined scripts within the cloned GitHub repository. After you deploy the VPA, check that its Pods are running:

      kubectl get pods -n kube-system
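
The kube-system namespace holds many Pods, so it can be easier to check for the VPA ones specifically. The sketch below assumes the install creates Pods whose names start with `vpa-recommender`, `vpa-updater`, and `vpa-admission-controller`. `vpa_components_running` is a hypothetical helper that reads the `kubectl get pods` table on stdin.

```shell
# Reads `kubectl get pods -n kube-system` output on stdin and succeeds only if
# each of the three VPA components has at least one Pod in the Running state.
# Column 1 is the Pod name and column 3 is its status.
vpa_components_running() {
  awk '
    $1 ~ /^vpa-recommender/          && $3 == "Running" { rec = 1 }
    $1 ~ /^vpa-updater/              && $3 == "Running" { upd = 1 }
    $1 ~ /^vpa-admission-controller/ && $3 == "Running" { adm = 1 }
    END { exit !(rec && upd && adm) }
  '
}
```

For example: `kubectl get pods -n kube-system | vpa_components_running && echo "VPA is up"`.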
    

Step 4: Test

Now we can test the vertical scaling of our Pods using an example manifest from the repository, which defines a sample Deployment and a VPA object that targets it:

      kubectl apply -f examples/hamster.yaml
    

After a moment, list your Pods and describe one of them (your Pod name will differ from hamster-c6967774f-q4x99):

      kubectl get pods -l app=hamster
      kubectl describe pod hamster-c6967774f-q4x99

    

In the output, the container's Requests section shows the CPU and memory reservation currently set for the Pod you selected.
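
If you'd rather not scan the full describe output, a JSONPath query can pull out just the requests. This is a sketch: `show_hamster_requests` is a hypothetical helper name, and it assumes each Pod has a single container.

```shell
# Print each hamster Pod's name and the resource requests of its first container.
show_hamster_requests() {
  kubectl get pods -l app=hamster \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources.requests}{"\n"}{end}'
}
```

Running `show_hamster_requests` now, and again after the VPA has replaced the Pods, gives a quick before-and-after comparison.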

After the application has run for a bit, the VPA replaces the original Pods with new ones that carry its recommended reservations. Watch for the replacement Pods:

      kubectl get pods --watch -l app=hamster
    

Once you see a new Pod, check for the optimized reservations (hamster-c6967774f-cbvnp is meant to represent your newly launched, right-sized Pods):

      kubectl describe pod hamster-c6967774f-cbvnp
    

The Requests section of the newly launched Pod now shows the CPU and memory reservations set by the VPA.
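
You can also ask the VPA object itself what it currently recommends. The sketch below assumes the example manifest names its VPA object `hamster-vpa`. Check with `kubectl get vpa` if yours differs. `show_vpa_recommendation` is a hypothetical helper name.

```shell
# Print the VPA's current target recommendation for each container.
# Assumption: the example VPA object is named "hamster-vpa"; pass another
# name as the first argument to override it.
show_vpa_recommendation() {
  kubectl get vpa "${1:-hamster-vpa}" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target}{"\n"}{end}'
}
```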

Step 5: Clean it up

Be sure to clean up everything once you’re done:

      kubectl delete -f examples/hamster.yaml
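
If you plan to keep the cluster, you may also want to remove the VPA components themselves. The sketch below assumes the cloned repository provides a `hack/vpa-down.sh` teardown script alongside the install script, and that you are still in the autoscaler/vertical-pod-autoscaler/ directory. `teardown_vpa` is a hypothetical helper name.

```shell
# Remove the VPA components if the teardown script is present.
# Assumes the current directory is autoscaler/vertical-pod-autoscaler/.
teardown_vpa() {
  if [ -x ./hack/vpa-down.sh ]; then
    ./hack/vpa-down.sh
  else
    echo "hack/vpa-down.sh not found; run this from autoscaler/vertical-pod-autoscaler/" >&2
    return 1
  fi
}
```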
    

And, if needed, you can delete the entire cluster as well (run eksctl get cluster --region us-east-1 if you need to look up its name):

      eksctl delete cluster --name YOUR_CLUSTER_NAME --region us-east-1
    

Summary

And that’s it! You’ve now successfully deployed the VPA into your Amazon EKS cluster.

If you’re interested in learning more about Amazon EKS, scaling capabilities, and how to get some sample applications up and running quickly, check out our Amazon EKS Quickstart course at Pluralsight.