NVIDIA Announces AI Foundry Services To Expand GenAI – Microsoft Azure Unveils H100 Instances, H200 Next Year

NVIDIA has announced its new AI Foundry Services initiative to further accelerate generative AI adoption, while Microsoft Azure rolls out its latest Hopper-based instances.

Press Release: NVIDIA announced an AI foundry service, a collection of NVIDIA AI Foundation Models, NVIDIA NeMo framework and tools, and DGX Cloud AI supercomputing and services that give enterprises an end-to-end solution for creating and optimizing custom generative AI models.

Using the AI foundry service, Amdocs, a leading provider of software and services for communications and media providers, will optimize enterprise-grade large language models for the telco and media industries to efficiently deploy generative AI use cases across their businesses, from customer experiences to network operations and provisioning. The LLMs will run on NVIDIA accelerated computing as part of the Amdocs amAIz framework.

The collaboration builds on the previously announced Amdocs-Microsoft partnership, enabling service providers to adopt these applications in secure, trusted environments, including on-premises and in the cloud.

As NVIDIA continues to collaborate with Microsoft to build state-of-the-art AI infrastructure, Microsoft is introducing additional H100-based virtual machines to Microsoft Azure to accelerate mid-range AI workloads.

At its Ignite conference in Seattle today, Microsoft announced its new NC H100 v5 VM series for Azure, the industry’s first cloud instances featuring NVIDIA H100 NVL GPUs. This offering brings together a pair of PCIe-based H100 GPUs connected via NVIDIA NVLink, with nearly 4 petaflops of AI compute and 188GB of faster HBM3 memory. The NVIDIA H100 NVL GPU can deliver up to 12x higher performance on GPT-3 175B over the previous generation and is ideal for inference and mainstream training workloads.
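The headline numbers for the NC H100 v5 follow directly from pairing two H100 NVL cards. As a rough sanity check (the per-GPU figures below are assumptions taken from NVIDIA's public H100 NVL spec sheet, not from this announcement):

```python
# Back-of-the-envelope check of the NC H100 v5 figures quoted above.
# Assumed per-GPU specs (NVIDIA H100 NVL public spec sheet):
#   94 GB HBM3 and ~1,979 TFLOPS FP8 (with sparsity) per GPU.
GPUS_PER_VM = 2
HBM3_PER_GPU_GB = 94
FP8_TFLOPS_PER_GPU = 1979  # with sparsity

total_memory_gb = GPUS_PER_VM * HBM3_PER_GPU_GB           # 188 GB
total_pflops = GPUS_PER_VM * FP8_TFLOPS_PER_GPU / 1000    # ~3.96 PFLOPS

print(f"Memory: {total_memory_gb} GB, compute: ~{total_pflops:.1f} PFLOPS FP8")
```

Two GPUs at 94 GB each give the quoted 188 GB, and two FP8 engines land just shy of the "nearly 4 petaflops" figure.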

Additionally, Microsoft announced plans to add the NVIDIA H200 Tensor Core GPU to its Azure fleet next year to support larger model inferencing with no reduction in latency. This new offering is purpose-built to accelerate the largest AI workloads, including LLMs and generative AI models. The H200 GPU brings dramatic increases both in memory capacity and bandwidth using the latest-generation HBM3e memory.

Compared to its predecessor, this new GPU will offer 141GB of HBM3e memory (1.8x more) and 4.8 TB/s of peak memory bandwidth (a 1.4x increase).
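Those multipliers check out against the H100 baseline. A minimal verification, assuming the H100 SXM's publicly listed figures of 80 GB HBM3 and 3.35 TB/s (which are not stated in this announcement):

```python
# Sanity check of the H200-vs-H100 gains quoted above.
# Assumed H100 SXM baseline (NVIDIA public spec sheet): 80 GB, 3.35 TB/s.
H100_MEMORY_GB, H100_BANDWIDTH_TBS = 80, 3.35
H200_MEMORY_GB, H200_BANDWIDTH_TBS = 141, 4.8

memory_gain = H200_MEMORY_GB / H100_MEMORY_GB                # ~1.76x ("1.8x more")
bandwidth_gain = H200_BANDWIDTH_TBS / H100_BANDWIDTH_TBS     # ~1.43x ("1.4x increase")

print(f"Memory: {memory_gain:.2f}x, bandwidth: {bandwidth_gain:.2f}x")
```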

Further expanding the availability of NVIDIA-accelerated generative AI computing for Azure customers, Microsoft announced another NVIDIA-powered instance: the NCC H100 v5.

These Azure confidential virtual machines (VMs) with NVIDIA H100 Tensor Core GPUs allow Azure customers to protect the confidentiality and integrity of their data and applications in use, in memory, while accessing the unsurpassed acceleration of H100 GPUs. These GPU-enhanced confidential VMs will be coming soon to private preview.
