This article is contributed. See the original author and article here.

This year at NVIDIA GTC, we’re excited to expand on our announcement of new GPU VM sizes available to Azure customers, continuing our promise of providing VMs for every workload at any scale, from a few inferences to training multi-billion-parameter natural language models.


The first is a flagship scale-up and scale-out offering built on the NVIDIA A100 Tensor Core GPU, the ND A100 v4 platform, for users requiring the next frontier of AI training capabilities, which we unveiled in August. Equally important in rounding out the Azure AI GPU portfolio is a new and efficient VM series equipped with the NVIDIA T4 Tensor Core GPU. The NC T4 v3 VMs are a cost-effective yet powerful option for customers looking to deploy machine learning in real-time production contexts.


Azure is committed to offering our customers the performance, pricing, unique features and worldwide availability they need to be successful in the rapidly changing landscape of accelerated cloud infrastructure. Four other generations of NVIDIA GPUs are also available to our customers worldwide, with a dozen Virtual Machine size options, including the industry’s only public cloud VMs with NVIDIA Mellanox InfiniBand scale-out networking.


Flagship NVIDIA A100-Powered Azure VMs for Demanding AI Training and Compute


During the AI and GPU Infrastructure at Every Scale in Azure session [A22245TW] at GTC, we’ll be disclosing more details around the NVIDIA A100-powered ND A100 v4 product for high-end GPU computing in the cloud. This brand-new offering, built around eight NVIDIA A100 Tensor Core GPUs, is poised to fulfill the most demanding AI training requirements for customers with large models, and to enable a new class of tightly coupled, distributed HPC workloads. With ND A100 v4, customers can harness the power of interconnected GPUs, each connected via PCIe Gen 4 and paired with an independent NVIDIA Mellanox HDR 200 gigabit InfiniBand link.


Scalable from eight GPUs in a single VM to hundreds of VMs and thousands of A100 GPUs, a scale rivaling some of the world’s largest supercomputers, clustering is turnkey with out-of-the-box support for NVIDIA’s NCCL2 communication library. Distributed workloads benefit from topology-agnostic VM placement and a staggering 1.6 terabits per second of interconnect bandwidth per virtual machine, configured automatically via standard Azure deployment constructs such as VM Scale Sets.
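That per-VM bandwidth figure follows directly from the topology described above: each of the eight GPUs carries its own 200 Gb/s HDR InfiniBand link. A quick sanity check of the arithmetic, using only figures stated in this article:

```python
# Per-VM scale-out bandwidth for ND A100 v4, from figures stated in this article.
GPUS_PER_VM = 8    # eight NVIDIA A100 GPUs per VM
LINK_GBPS = 200    # one independent HDR InfiniBand link per GPU, 200 Gb/s each

def per_vm_bandwidth_tbps(gpus: int = GPUS_PER_VM, link_gbps: int = LINK_GBPS) -> float:
    """Aggregate interconnect bandwidth per VM, in terabits per second."""
    return gpus * link_gbps / 1000

print(per_vm_bandwidth_tbps())  # 1.6
```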


Designed in close partnership with NVIDIA, the ND A100 v4 supports the same industry-standard machine learning frameworks, communication libraries, and containers commonly deployed on-premises, delivered as a standard Azure VM offering.
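As a concrete illustration of that framework compatibility, the sketch below shows how a distributed PyTorch-style job conventionally seeds the rendezvous environment variables read by `torch.distributed` before initializing its NCCL backend. This is a minimal, generic sketch: the head-node address and port are placeholders, not values specific to ND A100 v4 deployments.

```python
import os

def nccl_rendezvous_env(rank: int, world_size: int,
                        master_addr: str = "10.0.0.4",  # placeholder head-node IP
                        master_port: int = 29500) -> dict:
    """Environment variables conventionally consumed by torch.distributed
    when it initializes an NCCL process group."""
    return {
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
    }

# On each worker, a launcher would export these and then call
#   torch.distributed.init_process_group(backend="nccl")
env = nccl_rendezvous_env(rank=0, world_size=16)
os.environ.update(env)
print(env["WORLD_SIZE"])  # 16
```

In practice, launchers such as `torchrun` or MPI wrappers set these variables automatically; the point is that no Azure-specific code is required.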


The ND A100 v4 is in limited preview now, with an expanded public preview arriving late this year.


NVIDIA T4-Powered Azure VMs for Right-Sizing Cloud GPU Deployments


Also new to Azure are NVIDIA T4 Tensor Core GPUs, providing hardware-accelerated VMs for AI/ML workloads at a low overall price point.


The NC T4 v3 series provides a cost-effective, lightweight GPU option for customers performing real-time or small-batch inferencing, enabling them to right-size their GPU spending for each workload. Many customers don’t require the throughput afforded by larger GPU sizes and desire a wider regional footprint. For these use cases, adopting the NC T4 v3 series with no additional workload-level changes is an easy cost optimization. For customers with larger inferencing needs, Azure continues to offer NVIDIA V100-based VMs such as the original NC v3 series, in one-, two-, and four-GPU sizes.


The NC T4 v3 is an ideal platform for adding real-time AI-powered intelligence to services hosted in Azure. The new NC T4 v3 VMs are currently available for preview in the West US 2 region, with 1 to 4 NVIDIA T4 Tensor Core GPUs per VM, and will soon expand in availability, with over a dozen planned regions across North America, Europe, and Asia.


To learn more about NCasT4_v3-series virtual machines, visit the NCasT4_v3-series documentation.


Azure Stack Hub and Azure Stack Edge are now available with GPUs


Azure Stack Hub is available with multiple GPU options, such as the NVIDIA V100 Tensor Core GPU, which enables customers to run compute-intensive machine learning workloads in disconnected or partially connected scenarios. The NVIDIA T4 Tensor Core GPU is available in both Azure Stack Hub and Azure Stack Edge, providing visualization, inferencing, and machine learning for less computationally intensive workloads. With NVIDIA Quadro Virtual Workstation running on the NVIDIA T4 instance of Azure Stack Hub, creative and technical professionals can access the next generation of computer graphics and RTX-enabled applications, enabling AI-enhanced graphics and photorealistic design.


Learn more about these recent announcements here: Azure Stack Hub and Azure Stack Edge.


Microsoft Sessions at NVIDIA GTC


Accelerate Your ML Life Cycle on Azure Machine Learning [A22247]

Azure Machine Learning helps scale out deep learning training and inferencing using distributed deep learning frameworks to parallelize computation with NVIDIA GPUs. See how you can scale your ML workflows in a repeatable manner and automate and drive cost efficiencies into the entire ML life cycle across model building, training, deployment, and management.

Keiji Kanazawa, Principal Program Manager, Microsoft
Tuesday, Oct 6, 08:00–08:50 PDT


Accelerating PyTorch Model Training and Inferencing with NVIDIA GPUs and ONNX Runtime [A22270]

Extend the Microsoft experience of training large natural language models with ONNX Runtime and Azure Machine Learning to your own scenarios.

Natalie Kershaw, Senior Program Manager, Microsoft
On Demand


Visualization and AI Inferencing Using Azure Stack Hub and Azure Stack Edge [A22269]

Learn how you can use Azure Stack Hub or Azure Stack Edge to run machine learning models at edge locations close to where the data is generated. We’ll demonstrate how to train a model in Azure or Azure Stack Hub, run the same application code across cloud GPUs and edge devices, and distribute these models to the edge using hosted services such as Azure IoT Edge or as Kubernetes containers.

Vivek N Darera, Principal Program Manager, Microsoft
On Demand


AI and GPU Infrastructure at Every Scale in Azure [A22245]

Join us for a brief walkthrough of Azure’s GPU-equipped infrastructure capabilities, from visualization to massively parallel AI training supercomputers powered by the latest NVIDIA A100 GPUs and Mellanox HDR InfiniBand. From the cutting edge of AI research to production deployment of ML services worldwide, and any scenario in between, Azure has a VM for every workload.

Ian Finder, Senior Program Manager, Microsoft
On Demand

Brought to you by Dr. Ware, Microsoft Office 365 Silver Partner, Charleston SC.