AWS Auto Scaling: Getting it Right

Jun 1, 2021

AWS Auto Scaling lets you automatically scale AWS resources up and down. It provides a single console interface from which you can configure auto scaling across multiple AWS resources and services, for entire applications or individual resources.

What is AWS Auto Scaling?

In addition to the console, AWS Auto Scaling comes with a feature called “scaling plan”, which enables scaling based on application load metrics. This feature ensures that the resources of your application have enough capacity to handle the load.

Common use cases of AWS Auto Scaling include optimizing applications and workloads with a traffic flow that changes on a weekly or daily basis. For example, you can leverage auto scaling for apps with cyclical traffic, which may require a high quantity of resources during business hours and a low quantity during the night. You can also use auto scaling for batch processing, periodic analysis, testing, growth spikes, and marketing campaigns.

Which Types of Resources Can AWS Auto Scaling Manage?

Auto Scaling supports the following AWS resources:

  • EC2 Auto Scaling groups—contain collections of EC2 instances. This feature enables you to terminate or launch the EC2 instances located in Auto Scaling groups.
  • EC2 Spot Fleet requests—enable you to automatically launch or terminate instances in a Spot Fleet. You can also replace instances that get interrupted, to optimize costs and capacity.
  • ECS—can respond to load variations by adding or removing containers.
  • DynamoDB—adjusts the provisioned read and write capacity of tables or global secondary indexes, to manage traffic increases without throttling.
  • Aurora—can dynamically add or remove read replicas, helping you automatically handle changes in active workloads and connections.
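
For reference, the scaling plans API identifies each of these resource types by a service namespace and one or more scalable dimensions. The lookup below is a small sketch of that mapping. The names `SCALABLE_TARGETS` and `dimensions_for` are illustrative and not part of any AWS SDK.

```python
from typing import Dict, List, Tuple

# Service namespace and scalable dimensions per supported resource type.
# SCALABLE_TARGETS and dimensions_for are illustrative names, not SDK objects.
SCALABLE_TARGETS: Dict[str, Tuple[str, List[str]]] = {
    "EC2 Auto Scaling group": (
        "autoscaling", ["autoscaling:autoScalingGroup:DesiredCapacity"]),
    "EC2 Spot Fleet request": (
        "ec2", ["ec2:spot-fleet-request:TargetCapacity"]),
    "ECS service": (
        "ecs", ["ecs:service:DesiredCount"]),
    "DynamoDB table": (
        "dynamodb", ["dynamodb:table:ReadCapacityUnits",
                     "dynamodb:table:WriteCapacityUnits"]),
    "DynamoDB global secondary index": (
        "dynamodb", ["dynamodb:index:ReadCapacityUnits",
                     "dynamodb:index:WriteCapacityUnits"]),
    "Aurora DB cluster": (
        "rds", ["rds:cluster:ReadReplicaCount"]),
}


def dimensions_for(resource_type: str) -> List[str]:
    """Return the scalable dimensions for a resource type in the table above."""
    _namespace, dimensions = SCALABLE_TARGETS[resource_type]
    return dimensions


print(dimensions_for("DynamoDB table"))
```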


AWS does not currently support Auto Scaling for:

  • EBS volumes—EC2 auto scaling is not stateful. AWS EBS volumes mounted to old EC2 instances cannot be scaled along with new EC2 instances.
  • Elastic storage services—including Simple Storage Service (S3) and Elastic File System (EFS), which are built for scalability and do not need extended scaling features.
  • Relational Database Service (RDS)—comes with its own unique scaling behavior, which is not compatible with AWS Auto Scaling.
  • Serverless products—from the AWS serverless ecosystem, such as AWS Lambda and Lambda@Edge, cannot be scaled using AWS Auto Scaling.
  • Container services—such as Elastic Kubernetes Service (EKS), do not work with AWS Auto Scaling. Use native capabilities for Kubernetes deployments, such as Cluster Autoscaler, when auto scaling containers.

How Scaling Plans Work

Scaling plans form the foundation of AWS Auto Scaling. A plan defines a set of instructions for scaling AWS resources. To simplify organization and fully leverage scaling, you can use CloudFormation stacks or tags to group the resources that belong to an application. You can then use the scaling strategies recommended by the service and customize them for each resource or for groups of resources.

To ensure optimal results, you can combine dynamic scaling with predictive scaling. Dynamic scaling is based on metrics from AWS monitoring infrastructure, primarily Amazon CloudWatch.
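
As a rough illustration, a scaling plan that combines both approaches for one Auto Scaling group could be described with a request like the one below. This is a minimal sketch, assuming the `CreateScalingPlan` request shape exposed by the boto3 `autoscaling-plans` client. The helper `build_scaling_plan_request`, the plan name, tag, and group name are hypothetical, and the code only builds the request dictionary without calling AWS.

```python
def build_scaling_plan_request(plan_name, tag_key, tag_values, asg_name,
                               min_capacity, max_capacity, target_cpu_percent):
    """Build keyword arguments for a CreateScalingPlan call (hypothetical helper)."""
    return {
        "ScalingPlanName": plan_name,
        # Resources are discovered through tags here; a CloudFormation stack
        # ARN could be used instead via "CloudFormationStackARN".
        "ApplicationSource": {
            "TagFilters": [{"Key": tag_key, "Values": list(tag_values)}],
        },
        "ScalingInstructions": [{
            "ServiceNamespace": "autoscaling",
            "ResourceId": f"autoScalingGroup/{asg_name}",
            "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
            # Dynamic scaling: track average CPU utilization against a target.
            "TargetTrackingConfigurations": [{
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization",
                },
                "TargetValue": float(target_cpu_percent),
            }],
            # Predictive scaling: forecast total CPU load, but do not act yet.
            "PredefinedLoadMetricSpecification": {
                "PredefinedLoadMetricType": "ASGTotalCPUUtilization",
            },
            "PredictiveScalingMode": "ForecastOnly",
        }],
    }


request = build_scaling_plan_request(
    "example-plan", "app", ["example-app"], "example-asg", 2, 10, 50)

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("autoscaling-plans").create_scaling_plan(**request)
```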

Scaling Strategies

Scaling strategies define how AWS Auto Scaling optimizes the use of resources in a scaling plan. You can choose between different scaling strategies, and control how resource utilization is optimized. You can optimize for availability, cost, or a combination of the two.

Dynamic Scaling

Dynamic scaling policies modify resource capacity in response to live changes in resource usage. They are based on a configurable scaling metric that measures the actual load on your application, such as average CPU utilization. The goal is to provide enough capacity to keep that metric at the target value defined in the scaling strategy.
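
The arithmetic behind this kind of target tracking can be sketched in a few lines. This is a simplified proportional model, not the algorithm AWS actually runs, and the function name `desired_capacity` is made up for the example.

```python
import math


def desired_capacity(current_capacity, metric_value, target_value,
                     min_capacity, max_capacity):
    """Estimate the capacity needed to bring the metric back to its target.

    Assumes the metric scales inversely with capacity: doubling the number
    of instances roughly halves average utilization.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = current_capacity * metric_value / target_value
    # Round up so utilization ends at or below the target, then clamp
    # the result to the limits configured for the resource.
    return max(min_capacity, min(max_capacity, math.ceil(raw)))


# 4 instances at 80% average CPU with a 50% target: scale out.
print(desired_capacity(4, 80.0, 50.0, min_capacity=2, max_capacity=10))
# 4 instances at 20% average CPU with a 50% target: scale in to the minimum.
print(desired_capacity(4, 20.0, 50.0, min_capacity=2, max_capacity=10))
```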

Predictive Scaling

The predictive scaling capability uses machine learning to analyze the historical workload of each resource. Using this analysis, it regularly forecasts the load for the following two days. It then automatically schedules scaling actions that provision the capacity needed to keep utilization at the target value under the forecast load.
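
To make the last step concrete, the sketch below turns an hourly load forecast into a capacity schedule. The forecast values are invented and the function `plan_capacity` is hypothetical. The real service produces the forecast with its own model, which this example takes as given.

```python
import math


def plan_capacity(hourly_load_forecast, target_utilization,
                  min_capacity, max_capacity):
    """Map a forecast of total load per hour to a capacity per hour.

    hourly_load_forecast: forecast total load, e.g. CPU percent summed
    across all instances. target_utilization: desired load per instance.
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    schedule = []
    for hour, load in enumerate(hourly_load_forecast):
        needed = math.ceil(load / target_utilization)
        schedule.append((hour, max(min_capacity, min(max_capacity, needed))))
    return schedule


# Invented forecast: total CPU percent across the group for five hours.
forecast = [100, 150, 420, 600, 380]
for hour, capacity in plan_capacity(forecast, 50, min_capacity=2, max_capacity=10):
    print(f"hour {hour}: {capacity} instances")
```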

AWS Auto Scaling Best Practices

The following best practices can help you make more effective use of Auto Scaling.

Use EC2 Instance Metrics With 1-Minute Frequency

To ensure fast response to changes in application metrics, you should use a 1-minute frequency when scaling EC2 instances. If you use a 5-minute frequency, you might end up with slow responses based on stale metrics.

Note that by default, EC2 metric data is available at 5-minute intervals. To get 1-minute data, you need to enable detailed monitoring, which incurs an additional charge. To see actual capacity data in the capacity forecast graphs shown when you complete the Create Scaling Plan wizard, you also need to enable Auto Scaling group metrics.
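
Both settings can also be switched on programmatically. The following is a minimal sketch, assuming the two clients passed in are boto3 clients created with `boto3.client("ec2")` and `boto3.client("autoscaling")`. The wrapper `enable_one_minute_metrics` is a hypothetical helper.

```python
def enable_one_minute_metrics(ec2_client, autoscaling_client,
                              instance_ids, group_name):
    """Enable detailed EC2 monitoring and Auto Scaling group metrics.

    ec2_client and autoscaling_client are assumed to be boto3 clients.
    """
    # Detailed monitoring publishes instance metrics at 1-minute intervals
    # for the given running instances (this is billed separately).
    ec2_client.monitor_instances(InstanceIds=list(instance_ids))
    # Group metrics are published at 1-minute granularity.
    autoscaling_client.enable_metrics_collection(
        AutoScalingGroupName=group_name,
        Granularity="1Minute",
    )


# Example call, with boto3 installed and credentials configured:
#   import boto3
#   enable_one_minute_metrics(boto3.client("ec2"), boto3.client("autoscaling"),
#                             ["i-0123456789abcdef0"], "example-asg")
```

Instances that the group launches later take their monitoring setting from the launch template or launch configuration, so detailed monitoring should be enabled there as well.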

Pay Attention to Instance Types

Auto Scaling groups can use different EC2 instance types, and each type provides different performance. T2 and T3 instances, for example, come with burstable performance: they provide a baseline level of CPU performance and can burst to higher levels when needed. Each scaling plan defines a target utilization. If that target is above the baseline level, instances can run out of CPU credits, and performance will be impacted.
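
A quick back-of-the-envelope calculation shows why this matters. The sketch below assumes the usual definition of a CPU credit (one vCPU at 100% for one minute) and an instance that stops bursting once its balance is empty. The instance figures are illustrative inputs rather than values for a specific instance type, and `hours_until_credits_exhausted` is a hypothetical helper.

```python
def hours_until_credits_exhausted(vcpus, baseline_percent,
                                  utilization_percent, credit_balance):
    """Return hours until the credit balance reaches zero.

    Returns None when utilization is at or below the baseline, because the
    balance never shrinks in that case.
    """
    # One credit = one vCPU running at 100% for one minute.
    earned_per_hour = vcpus * baseline_percent / 100 * 60
    spent_per_hour = vcpus * utilization_percent / 100 * 60
    net_burn = spent_per_hour - earned_per_hour
    if net_burn <= 0:
        return None
    return credit_balance / net_burn


# Illustrative instance: 2 vCPUs, 20% baseline, 288 credits in the bank,
# held at a 50% target utilization by the scaling plan.
print(hours_until_credits_exhausted(2, 20, 50, 288))
# The same instance with a 15% target never drains its balance.
print(hours_until_credits_exhausted(2, 20, 15, 288))
```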

Use Predictive Scaling

The predictive scaling feature schedules future capacity according to workload forecasts. However, the quality of each forecast may vary, depending on how accurately the trained forecasting model can assess the workload. This is often highly impacted by whether the workload follows regular cycles.

To assess forecast quality and subsequent scaling actions, you can run predictive scaling in forecast-only mode. You can set this up when creating a scaling plan, and change it to forecast and scale mode after completing the forecast quality assessment.
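
One simple way to run such an assessment is to compare the forecast with the load that was actually observed. The sketch below uses mean absolute percentage error for that comparison. The 20% threshold and both function names are assumptions made for the example, while `ForecastOnly` and `ForecastAndScale` are the mode values used by the scaling plans API.

```python
def mean_absolute_percentage_error(actual, forecast):
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(forecast):
        raise ValueError("series must have the same length")
    # Skip points where the actual load was zero to avoid dividing by zero.
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    if not errors:
        raise ValueError("no non-zero actual values to compare against")
    return 100 * sum(errors) / len(errors)


def choose_predictive_mode(actual, forecast, max_error_percent=20.0):
    """Pick a mode based on forecast error (the threshold is an assumption)."""
    error = mean_absolute_percentage_error(actual, forecast)
    return "ForecastAndScale" if error <= max_error_percent else "ForecastOnly"


# Invented hourly load values and the forecast made for the same hours.
actual_load = [100, 200, 400]
forecast_load = [110, 190, 400]
print(choose_predictive_mode(actual_load, forecast_load))
```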

Correlate Scaling and Load Metrics

You can define custom metrics for predictive scaling. However, the load metric and the scaling metric need to be strongly correlated. Make sure that the metric value always increases and decreases proportionally to the number of instances in your Auto Scaling group. The goal is to enable proportional scaling of instances.
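
Before settling on a pair of custom metrics, it can help to check the correlation on historical data. The sketch below computes a Pearson correlation coefficient by hand. The sample values are invented, and they assume both series were sampled at the same times while the group size stayed constant.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(variance_x * variance_y)


# Invented samples: a load metric (total requests) and a scaling metric
# (average CPU percent per instance) taken at the same points in time.
load_metric = [1000, 2000, 3000, 4000]
scaling_metric = [20, 41, 59, 80]
print(round(pearson_correlation(load_metric, scaling_metric), 3))
```

A value close to 1 suggests the two metrics move together. A weak correlation is a sign that the load metric is a poor basis for forecasting the capacity the scaling metric will need.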


AWS Auto Scaling enables you to scale a wide range of AWS resources, currently excluding EBS volumes, elastic storage services, Relational Database Service (RDS), serverless products, and container services. To make the best of AWS Auto Scaling, choose the scaling strategy that fits your needs and priorities, combining dynamic and predictive scaling where it makes sense.
