
Quick Overview of The AWS Auto Scaling Group

What Is an Auto Scaling Group?

AWS Auto Scaling is an AWS service that lets you configure automatic scaling and maintenance for the resources that make up your application, according to their performance and workload.
You can configure automatic scaling for individual resources or for whole applications, and you can use it to create and manage scaling plans. A scaling plan uses dynamic scaling and predictive scaling to automatically scale your application resources. With a scaling plan you can choose a scaling strategy that optimizes for cost, for availability, or for a balance of both. Alternatively, you can create custom scaling strategies.

In practical terms, AWS Auto Scaling helps you save cost by reducing the number of EC2 instances when they are not needed and scaling out to add more instances only when required. The AWS Auto Scaling console provides a single UI for the automatic scaling features of multiple AWS services.

Advantages of AWS Auto Scaling

AWS Auto Scaling is useful for applications that experience daily or weekly variations in traffic flow, including: 

  • Cyclical traffic such as high use of resources during regular business hours and low use of resources overnight. 
  • On and off workload patterns, such as batch processing, testing, or periodic analysis.
  • Variable traffic patterns, such as marketing campaigns with periods of spiky growth.

You can use AWS Auto Scaling to scale Amazon EC2 Auto Scaling groups, Spot Fleet requests, DynamoDB tables, Amazon Aurora replicas, and Amazon ECS services.

Application Auto Scaling

Application Auto Scaling is a web service for developers and system administrators who need to automatically scale resources for individual AWS services beyond Amazon EC2. The resources you can scale with Application Auto Scaling include:

  • Amazon ECS services
  • Spot Fleet requests 
  • Amazon EMR clusters
  • AppStream 2.0 fleets 
  • DynamoDB tables and global secondary indexes 
  • Aurora replicas
  • Amazon SageMaker endpoint variants 
  • Custom resources provided by your own applications or services. 
  • Amazon Comprehend document classification endpoints 
  • Lambda function provisioned concurrency

You can also use Application Auto Scaling together with Amazon EC2 Auto Scaling to scale resources across multiple services. AWS Auto Scaling can help you maintain optimal availability and performance by combining predictive scaling and dynamic scaling (proactive and reactive approaches, respectively) to scale your Amazon EC2 capacity faster.
Application Auto Scaling also allows you to automatically scale your scalable resources according to conditions that you define.

Components of AWS Auto Scaling

Application Auto Scaling supports the following scaling methods:

  • Target tracking scaling: Scale a resource based on a target value for a specific CloudWatch metric.
  • Step scaling: Scale a resource based on a set of scaling adjustments that vary with the size of the alarm breach.
  • Scheduled scaling: Scale a resource based on the date and time.

To create a target tracking scaling policy, you specify an Amazon CloudWatch metric and a target value that represents the ideal average utilization or throughput level for your application.
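
To make target tracking more concrete, here is a minimal sketch in Python. It assumes that the tracked metric (for example average CPU utilization) falls roughly in proportion as instances are added, so capacity is adjusted by the ratio of the current metric to the target. The function name and the proportional formula are illustrative assumptions, not the service's documented algorithm.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value,
                             min_size, max_size):
    """Hypothetical proportional estimate of the capacity needed to bring
    the average metric back to the target value."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Example: 80% average CPU against a 50% target needs 80/50 = 1.6x capacity.
    needed = math.ceil(current_capacity * metric_value / target_value)
    # Keep the result inside the configured minimum and maximum.
    return max(min_size, min(max_size, needed))
```

With four instances at 80% CPU and a 50% target, this sketch asks for seven instances; if the metric later drops to 25%, it asks for two.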

EC2 Auto Scaling Group

An Auto Scaling group allows your AWS compute capacity to grow or shrink depending on your application's workload requirements. It is a collection of EC2 instances treated as a logical grouping for the purposes of management and automatic scaling.
Amazon EC2 Auto Scaling helps ensure that you have the right number of EC2 instances for your needs at all times. An Auto Scaling group has a minimum, maximum, and desired capacity of EC2 instances, and you can edit these values after you create the group.
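
The relationship between minimum, maximum, and desired capacity can be sketched as a small data class. The class and method names below are hypothetical; the sketch only shows the invariant that the desired capacity always stays between the two bounds.

```python
from dataclasses import dataclass


@dataclass
class GroupCapacity:
    """Hypothetical model of an Auto Scaling group's capacity settings."""
    min_size: int
    max_size: int
    desired: int

    def __post_init__(self):
        if not 0 <= self.min_size <= self.max_size:
            raise ValueError("need 0 <= min_size <= max_size")
        if not self.min_size <= self.desired <= self.max_size:
            raise ValueError("desired must lie between min_size and max_size")

    def scale_by(self, delta):
        """Return a new capacity with desired moved by delta, kept in bounds."""
        new_desired = max(self.min_size,
                          min(self.max_size, self.desired + delta))
        return GroupCapacity(self.min_size, self.max_size, new_desired)
```

For example, a group with a minimum of 1 and a maximum of 5 that is asked to add ten instances ends up with a desired capacity of 5, not 12.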

Launch Configuration

A launch configuration is the template an Auto Scaling group uses to launch EC2 instances. You can create one from the AWS Console or the CLI, either from scratch, from an AMI, or from an existing running EC2 instance. Before using a launch configuration, make sure that the AMI it references still exists on AWS. When you create a launch configuration from a running instance, EC2 instance tags and any additional block store volumes that you attached after launching the instance are not taken into account. You can specify only one launch configuration for an Auto Scaling group at a time.
A launch configuration cannot be modified after you create it. To change it, you have to create a new launch configuration with the required changes and then update your Auto Scaling groups to use the new one.

EC2 Launch Templates

A launch template is similar to a launch configuration. It also specifies the AMI ID, instance type, security groups, tags, and key pair, among other parameters used to launch an instance in an Auto Scaling group.
Unlike launch configurations, launch templates allow you to have multiple versions of a template, which you can then reuse to create other templates or template versions.
AWS recommends using launch templates rather than launch configurations so that you have access to the latest EC2 features. With launch templates, you can provision capacity using both On-Demand and Spot Instances (which cannot be done with launch configurations). That helps you achieve the desired scale, performance, and cost targets for your application.
You can create a template from scratch, create a new version of an existing one, or copy parameters from a launch configuration, a running instance, or another template. You can also delete the versions used for testing your application when you no longer need them.
Moreover, you cannot edit a launch template version after you create it; instead, you create a new version that carries the changes.
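
The versioning behavior can be illustrated with a toy in-memory model. This is a sketch, not the EC2 API: the class name, method names, and parameter values are hypothetical, and deleting versions is not modeled. It shows that existing versions stay read-only and that each change produces a new version number.

```python
from types import MappingProxyType


class LaunchTemplateModel:
    """Toy model: versions are read-only, changes append a new version."""

    def __init__(self, name, **params):
        self.name = name
        self._versions = [MappingProxyType(dict(params))]  # version 1

    @property
    def latest(self):
        return len(self._versions)

    def get(self, version):
        if not 1 <= version <= self.latest:
            raise KeyError(f"no such version: {version}")
        return self._versions[version - 1]

    def create_version(self, source_version=None, **overrides):
        """Copy a version (default: latest), apply overrides, and return
        the number of the newly created version."""
        source = self.get(source_version or self.latest)
        self._versions.append(MappingProxyType({**source, **overrides}))
        return self.latest
```

Creating a template with one instance type and then calling `create_version(InstanceType=...)` leaves version 1 untouched and returns 2 for the new version.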


Multi AZ and Rebalancing
An Auto Scaling group can span multiple Availability Zones (AZs) within the same AWS region, but not across regions. You can also determine which subnets the Auto Scaling group uses to launch new instances in each AZ. This lets you create fault-tolerant designs within a region.
Availability Zone rebalancing: the Auto Scaling service always tries to distribute EC2 instances evenly across the AZs you enable. If it finds that the EC2 instances launched by an Auto Scaling group are not balanced across those AZs, it initiates a rebalancing activity. Also, if Auto Scaling fails to launch instances in an AZ (because of an AZ failure or unavailable capacity), it tries the other AZs defined for the group until it succeeds.
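
The even-distribution and fallback behavior described above can be sketched as follows. The function names are hypothetical, and treating "balanced" as "counts differ by at most one" is an assumption made for illustration.

```python
def pick_launch_zone(instances_per_zone, unavailable=()):
    """Pick the enabled AZ with the fewest instances, skipping AZs that
    currently cannot launch instances (failure or no capacity)."""
    candidates = {zone: count for zone, count in instances_per_zone.items()
                  if zone not in unavailable}
    if not candidates:
        raise RuntimeError("no Availability Zone can launch instances")
    # Sorting first makes ties resolve to the alphabetically first zone.
    return min(sorted(candidates), key=candidates.get)


def is_balanced(instances_per_zone):
    """Assumption: balanced means AZ counts differ by at most one."""
    counts = list(instances_per_zone.values())
    return max(counts) - min(counts) <= 1
```

With three instances in one zone and one in each of two others, the next launch goes to one of the smaller zones; if that zone is unavailable, the sketch falls back to the other one.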

There is no additional cost for using the Auto Scaling service. You pay only for the resources that your Auto Scaling groups launch.
Auto Scaling groups work well with Elastic Load Balancing (ELB), CloudWatch, and CloudTrail. The service is also compliant with PCI DSS.


You can attach one or more Elastic Load Balancers (Classic, Application, or Network Load Balancer) to an existing Auto Scaling group. The load balancers must be in the same region and the same VPC as the Auto Scaling group.
For a Classic Load Balancer, you attach the load balancer itself; for an Application or Network Load Balancer, you attach its target group.
Once you do this, any EC2 instance that already exists in the group or is later added by it is automatically registered with the attached load balancers.
You do not need to register those instances manually. The load balancers become the entry point for any inbound traffic destined for the group's EC2 instances.
The connection draining configuration is also honored when an instance is deregistered from an ELB.
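
The difference between attaching a Classic Load Balancer and attaching a target group can be sketched as a small helper that builds the request without sending it. The helper name and the `lb_kind` values are hypothetical; the operation and parameter names are those of the boto3 `autoscaling` client. Nothing here calls AWS, so actually sending the request (for example with `getattr(client, operation)(**kwargs)`) would additionally require boto3 and valid credentials.

```python
def attachment_request(group_name, lb_kind, identifier):
    """Return (operation, kwargs) for attaching a load balancer to a group.

    lb_kind is "classic", "application" or "network". identifier is the
    load balancer name for classic, or the target group ARN otherwise.
    """
    if lb_kind == "classic":
        return ("attach_load_balancers",
                {"AutoScalingGroupName": group_name,
                 "LoadBalancerNames": [identifier]})
    if lb_kind in ("application", "network"):
        return ("attach_load_balancer_target_groups",
                {"AutoScalingGroupName": group_name,
                 "TargetGroupARNs": [identifier]})
    raise ValueError(f"unknown load balancer kind: {lb_kind}")
```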


Auto Scaling classifies the health status of its EC2 instances as either Healthy or Unhealthy. By default, Auto Scaling uses only EC2 status checks to determine the health status of an instance. When you have one or more ELBs attached to the Auto Scaling group, you can configure Auto Scaling to use both the EC2 status checks and the ELB health checks. By default, the health check grace period is 300 seconds when you create the group in the console.
The grace period is the time Auto Scaling waits after an instance comes into service (becomes InService) before checking its health status. A value of zero means there is no grace period, and the instance's health is checked as soon as it is in service. Until the grace period expires, any unhealthy status reported by EC2 status checks or by an ELB attached to the group is not acted upon.

After the grace period expires, Auto Scaling considers an instance unhealthy in any of the following cases:

  • EC2 status checks report an instance state other than running, or the instance status is impaired because of a host hardware or software problem.
  • You configured Auto Scaling to use ELB health checks, and the ELB reports the instance as “Out-of-Service”.
  • You have multiple ELBs attached to the group, and any one of them reports the instance as “Out-of-Service”.

One source reporting the instance as unhealthy is enough for Auto Scaling to mark it for replacement.
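
The health rules above can be condensed into a single decision function. This is a sketch under stated assumptions: the function name, its parameters, and the "OutOfService" state string are illustrative, not an AWS API.

```python
def is_unhealthy(seconds_in_service, grace_period, ec2_status_ok,
                 elb_states=(), use_elb_checks=False):
    """Return True if the instance should be marked for replacement."""
    if seconds_in_service < grace_period:
        # Reports are not acted upon until the grace period expires.
        return False
    if not ec2_status_ok:
        return True
    if use_elb_checks and any(state == "OutOfService" for state in elb_states):
        # One load balancer reporting a failure is enough.
        return True
    return False
```

An instance that has been in service for 100 seconds with a 300-second grace period is left alone even if a check fails; after 300 seconds, a single failing source marks it unhealthy.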

AWS provides several options for scaling your application to meet its desired capacity. They include:

  • Manual scaling: Manually change the minimum, maximum, and desired capacity of your group. You can also manually attach or detach instances.
  • Cyclic (schedule-based) scaling: Scale at set dates and times for predictable load changes.
  • Dynamic (event-based) scaling: Scale in response to an event or alarm. This can use a step scaling policy or a simple scaling policy. A step scaling policy uses step adjustments to increase or decrease capacity based on the size of the CloudWatch alarm breach. A simple scaling policy waits for the cooldown period to complete before scaling again. More on this later.
  • Predictive scaling: Forecasts load from daily and weekly traffic patterns and adds capacity ahead of time; combined with dynamic scaling, it gives you proactive and reactive scaling together. It is very effective for cyclical traffic and recurring on-and-off workload patterns such as batch processing. It is also good for applications that take a long time to initialize, causing noticeable latency during scale-out.

An Auto Scaling group can have multiple policies attached to it at any time.
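
As a small illustration of schedule-based scaling from the list above, the sketch below picks a desired capacity from the hour of day. The schedule values, constant names, and function name are hypothetical; a real scheduled action would be defined on the group itself rather than computed in application code.

```python
from datetime import datetime

# Hypothetical schedule: (start_hour, end_hour, desired_capacity), end exclusive.
BUSINESS_HOURS_SCHEDULE = [(8, 18, 6)]
OVERNIGHT_CAPACITY = 2


def scheduled_capacity(moment, schedule=BUSINESS_HOURS_SCHEDULE,
                       default=OVERNIGHT_CAPACITY):
    """Return the desired capacity that applies at the given datetime."""
    for start, end, capacity in schedule:
        if start <= moment.hour < end:
            return capacity
    return default
```

With this schedule, the group runs six instances between 08:00 and 18:00 and two instances the rest of the time.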

A dynamic scaling policy instructs Amazon EC2 Auto Scaling to track a specific CloudWatch metric. It defines what action to take when the associated CloudWatch alarm is triggered. An alarm is an object that watches over a single metric (CPU utilization, memory, network in/out, etc.). You need both a scale-out and a scale-in policy configured, which instruct Auto Scaling what to do in response to alarms. You use CloudWatch to monitor the metrics and generate the alarms.


Simple Scaling: A single adjustment (up or down) in response to an alarm. It waits for a cooldown timer to expire before responding to more alarms.
Cooldown period: The period of time Auto Scaling waits after a scaling activity (scale in or out) until the effect of that activity becomes visible.
Step Scaling: Multiple steps/adjustments, chosen according to the size of the alarm breach. It supports a warm-up timer.
Warm-up period: The time a newly launched EC2 instance takes to be ready and contribute to the watched metric. Until it elapses, an instance launched by step scaling is not counted toward the group's metrics.
Target Tracking Scaling: Increases or decreases the current capacity of the group based on a target value for a specific metric.
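
The difference between simple and step scaling can be sketched in a few lines. The function names and the step table are hypothetical. Step bounds are expressed as offsets from the alarm threshold, with the lower bound inclusive and the upper bound exclusive, and `None` meaning no upper bound.

```python
def step_adjustment(metric_value, threshold, steps):
    """Return the capacity change for the step that matches the breach size.

    steps is a list of (lower, upper, adjustment) tuples where lower and
    upper are offsets from the alarm threshold.
    """
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0  # the metric has not breached the threshold, so no change


def simple_scaling_allowed(now, last_activity_at, cooldown_seconds):
    """Simple scaling ignores further alarms until the cooldown has elapsed."""
    return (last_activity_at is None
            or now - last_activity_at >= cooldown_seconds)
```

With a 60% CPU threshold and steps of `[(0, 10, 1), (10, 20, 2), (20, None, 4)]`, a reading of 65% adds one instance, 75% adds two, and 95% adds four, whereas simple scaling would apply one fixed adjustment and then wait out its cooldown.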



You can monitor an Auto Scaling group using its health check features. Other services you can use to monitor Auto Scaling groups include CloudWatch and CloudWatch dashboards, the AWS Health Dashboard, CloudTrail logs, and Simple Notification Service (SNS). The EC2 service sends metrics about the group's instances to CloudWatch. Basic monitoring, which reports metrics every 5 minutes, is free of charge and enabled by default. You can enable detailed monitoring for 1-minute metrics, although you will be charged for it.


AWS Auto Scaling is a very important service if you want to benefit from the elasticity of cloud computing. It can reduce your cloud spend considerably and improve your application's resilience too.





