
Azure autoscale automatically increases or decreases the number of VM instances in the Azure virtual machine scale set (VMSS) that runs your application. This automated, elastic behavior reduces the management overhead of monitoring and optimizing your application's performance. You create rules that define the acceptable performance for a positive customer experience. When those thresholds are met, the autoscale rules adjust the capacity of your scale set. You can also schedule events that increase or decrease the capacity of your scale set at fixed times.


Now you can use machine learning to help manage and scale out your Virtual Machine Scale Sets with Predictive autoscale. The capacity needs of your Virtual Machine Scale Sets are forecasted based on the historical CPU patterns. When enabled, the predicted overall CPU load is observed and scale-out occurs in advance and in time to meet the demand.

Predictive autoscale suits workloads with cyclical patterns and is especially advantageous when Virtual Machines have long provisioning times. Because the scale-out happens before the demand arrives, capacity is already in place for the peak load, and you do not have to keep extra instances running all the time to cover it, which lowers costs.
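The idea of scaling out ahead of demand can be illustrated with a minimal sketch. This is not how Azure implements its forecast; the function name `should_scale_out_early`, the sample forecast values, and the 75% threshold are all hypothetical and serve only to show the look-ahead logic.

```python
from datetime import datetime, timedelta

def should_scale_out_early(forecast, now, look_ahead, cpu_threshold):
    """Return True if any forecast point inside the look-ahead window
    exceeds the CPU threshold (illustrative logic, not Azure's algorithm)."""
    horizon = now + look_ahead
    return any(
        now <= timestamp <= horizon and cpu > cpu_threshold
        for timestamp, cpu in forecast
    )

# Hypothetical forecast: (timestamp, predicted CPU percentage)
now = datetime(2022, 10, 12, 8, 0)
forecast = [
    (now + timedelta(minutes=5), 40.0),
    (now + timedelta(minutes=15), 82.0),
    (now + timedelta(minutes=45), 95.0),
]

# A 20-minute look-ahead sees the 82% point; a 10-minute one does not.
early_20 = should_scale_out_early(forecast, now, timedelta(minutes=20), 75.0)
early_10 = should_scale_out_early(forecast, now, timedelta(minutes=10), 75.0)
print(early_20, early_10)
```

A longer look-ahead gives slow-provisioning VMs more time to come online, at the cost of running the extra instances slightly earlier.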

Predictive autoscale is available for Virtual Machine Scale Sets and supports scale-out only, based on the Percentage CPU metric only. To scale in, you still need to create standard autoscale settings for your Virtual Machine Scale Sets.

Predictive autoscale can be configured via the Azure Portal, the CLI, and PowerShell.

To get started with predictive autoscale from the Azure Portal, navigate to your Virtual Machine Scale Set's scaling blade and then to the Predictive autoscale section.


Using the Predictive Autoscale dropdown selection, you can:

  • Enable forecast only mode

  • Enable predictive autoscale

  • Disable predictive autoscale. This is the default setting when you first land on the page for predictive autoscale.



Once you've made your predictive autoscale selection (that is, forecast only or predictive autoscale) and, optionally, configured how far in advance you want to scale out, click Save to start the process.
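For readers who configure autoscale through templates or scripts rather than the portal, the three modes and the optional look-ahead time can be pictured as a small policy object. The sketch below is an assumption, not confirmed by this article: the field names `scaleMode` and `scaleLookAheadTime`, the mode strings, and the ISO 8601 duration format are modeled on the autoscale setting resource as I understand it, and `build_predictive_policy` is a hypothetical helper. Check the current Azure Monitor documentation before relying on any of it.

```python
import json

# Assumed mode names, mirroring the three portal choices:
# disabled (default), forecast only, and predictive autoscale enabled.
VALID_MODES = {"Disabled", "ForecastOnly", "Enabled"}

def build_predictive_policy(scale_mode, look_ahead_minutes=None):
    """Build a predictive autoscale policy dict (hypothetical helper).

    look_ahead_minutes is optional, matching the optional portal setting
    for how far in advance to scale out.
    """
    if scale_mode not in VALID_MODES:
        raise ValueError(f"unknown scale mode: {scale_mode!r}")
    policy = {"scaleMode": scale_mode}
    if look_ahead_minutes is not None:
        if look_ahead_minutes < 0:
            raise ValueError("look-ahead time cannot be negative")
        # Assumed format: ISO 8601 duration, e.g. PT10M for ten minutes
        policy["scaleLookAheadTime"] = f"PT{look_ahead_minutes}M"
    return policy

print(json.dumps(build_predictive_policy("Enabled", 10), indent=2))
print(json.dumps(build_predictive_policy("ForecastOnly"), indent=2))
```

Starting in forecast only mode lets you compare the predictions against real load in the charts before allowing them to trigger any scaling.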

A new tab, Predictive Charts, is now available in the scaling blade. It provides an in-depth overview of the forecasted CPU as well as any autoscaling that occurs when predictive autoscale is enabled.

Please note that Predictive autoscale requires a minimum of 7 days of history to provide predictions. The most accurate results come from 15 days of historical data.





  • The top chart shows an overlaid comparison of actual versus predicted total CPU percentage. The time span of the graph shown is from the last 24 hours to the next 24 hours.

  • The middle chart shows the number of instances running at specific times over the last 24 hours.

  • The bottom chart shows the average CPU utilization over the last 24 hours.

In addition, you can enable or disable forecast only mode or predictive autoscale from this tab, and you can view the charts with a time grain ranging from 12 hours to 7 days.



You can head over to Run History to review when scale outs occur due to predictive autoscale.



The forecast model observes the different capacity needs during the day and manages the overall load on an as-needed basis. You need to create a standard autoscale setting alongside predictive autoscale, because it provides a fallback if, for example, forecast data is missing. Predictive autoscale can work with other autoscale conditions: autoscale always honors the request for the highest number of resources, so that any workload requirement is met.
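The "highest request wins" behavior and the fallback to standard autoscale can be sketched in a few lines. This is an illustration only: `resolve_instance_count` and the condition names are hypothetical, and clamping the result to a minimum and maximum capacity is an assumption added for completeness rather than something stated above.

```python
def resolve_instance_count(requests, minimum, maximum):
    """Pick the target instance count from several autoscale requests.

    requests maps a condition name to the instance count it asks for.
    The highest request wins; the result is then clamped to the
    [minimum, maximum] capacity range (the clamp is an assumption).
    """
    if not requests:
        # No condition has an opinion, so stay at the minimum.
        return minimum
    target = max(requests.values())
    return max(minimum, min(maximum, target))

# Both conditions report: the larger (predictive) request is used.
both = resolve_instance_count({"standard": 4, "predictive": 6}, 2, 10)

# Forecast data is missing: the standard rule alone acts as the fallback.
fallback = resolve_instance_count({"standard": 4}, 2, 10)

print(both, fallback)
```

Because the maximum is taken across conditions, a missing or low forecast can never pull capacity below what the standard rules ask for.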

Predictive autoscale is generally available in all public regions as of today, Oct 12, 2022. For more information you can refer to the documentation or email:
