AMD Launches ROCm 6.1.3 Open AI Software Alongside Radeon PRO W7900 Dual Slot GPU


AMD has officially launched its new ROCm 6.1.3 Open software suite with added AI capabilities & support alongside the Radeon PRO W7900 Dual Slot GPU.

Announced at Computex 2024, AMD is formally launching its Radeon PRO W7900 Dual Slot GPU in the retail segment. Priced at $3499 US, the Radeon PRO W7900 is said to offer leadership performance-per-dollar and a vast 48 GB memory pool, enabling faster performance within AI-oriented workloads such as LLMs (Large Language Models).

Coming straight to the specifications, the AMD Radeon PRO W7900 graphics card comes in a dual-slot design, making it a perfect solution for workstation setups which can house up to four of these AI-ready behemoths. Internally, the graphics card features the Navi 31 XTX GPU core with 6144 cores packed within 96 compute units. The GPU is equipped with a 384-bit wide bus & comes loaded with 48 GB of GDDR6 memory, delivering up to 864 GB/s of bandwidth. The card also packs 96 MB of Infinity Cache to take things up a notch further, providing even more bandwidth to this high-end GPU.
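As a quick sanity check on those memory figures, peak bandwidth is the bus width multiplied by the per-pin data rate. The sketch below works backward from the 384-bit bus and 864 GB/s quoted above; the per-pin rate it prints is derived here rather than stated in the announcement, and the function names are purely illustrative.

```python
def bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s for a bus where every pin
    transfers data_rate_gbps gigabits per second."""
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


def implied_data_rate_gbps(bus_width_bits: int, bandwidth_gb_s: float) -> float:
    """Per-pin data rate (Gbps) implied by a quoted bandwidth and bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


# Figures quoted in the article: 384-bit bus, 864 GB/s.
rate = implied_data_rate_gbps(384, 864.0)
print(f"Implied per-pin data rate: {rate} Gbps")
print(f"Round-trip bandwidth: {bandwidth_gb_per_s(384, rate)} GB/s")
```

The Infinity Cache is not modeled here; it raises effective bandwidth for cache hits, which this simple formula does not capture.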

Following are some of the main highlights of the graphics card:

Impressive Performance – Featuring 96 compute units with 192 AI accelerators and 96 ray accelerators, it delivers up to 61.32 TFLOPS of peak single-precision (FP32) and 122.64 TFLOPS of peak half-precision (FP16) performance.

Leadership Performance-Per-Dollar – Delivers up to 38% better performance-per-dollar in Llama 3 70B Q4 than the competitive offering with the ability to fit the 70B parameter model on a single GPU framebuffer.

Exceptional Memory – Handles multitasking and complex projects with 48 GB of memory protected by error-correcting code (ECC) technology to ensure data integrity.

AI on the Desktop – A local PC or workstation equipped with up to four AMD Radeon PRO W7900 Dual Slot graphics cards adds powerful AI performance to any IT infrastructure, is ideal for mission-critical projects, and can help keep sensitive data in-house.
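The claim that a 70B-parameter model at Q4 fits in a single card's framebuffer can be checked with back-of-the-envelope arithmetic. The sketch below estimates the weight footprint only and assumes exactly 4 bits per weight; real Q4 formats typically use slightly more, and the KV cache and activations need additional room, so the `overhead_gb` parameter is a hypothetical placeholder rather than a measured value.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def fits_on_card(num_params: float, bits_per_weight: float,
                 card_memory_gb: float, overhead_gb: float = 0.0) -> bool:
    """True if weights plus an assumed overhead fit in the card's memory."""
    needed = weight_footprint_gb(num_params, bits_per_weight) + overhead_gb
    return needed <= card_memory_gb


PARAMS_70B = 70e9

print(weight_footprint_gb(PARAMS_70B, 4))    # 4-bit weights
print(weight_footprint_gb(PARAMS_70B, 16))   # FP16 weights, for comparison
print(fits_on_card(PARAMS_70B, 4, 48))       # 48 GB card
print(fits_on_card(PARAMS_70B, 4, 24))       # 24 GB card
```

Under these assumptions the 4-bit weights come to roughly 35 GB, which leaves headroom on a 48 GB card but does not fit on a 24 GB one.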

Besides the Radeon PRO W7900 Dual Slot GPU launch, AMD is also launching its ROCm 6.1.3 open software suite for public release. The new ROCm software brings enhanced accessibility & widened support for consumer-tier graphics cards under the Radeon and Radeon PRO series. The release highlights include:

Multi-GPU support to enable building scalable AI desktops for multi-serving, multi-user solutions.

Beta-level support for Windows Subsystem for Linux, allowing these solutions to work with ROCm on a Windows OS-based system.

TensorFlow Framework support offers more choices for AI development.

AMD states that ROCm 6.1.3 supports up to four qualified Radeon RX and Radeon PRO GPUs (Radeon RX 7900 XTX, 7900 XT, 7900 GRE, Radeon PRO W7900, PRO W7900 DS, PRO W7800). Any combination of up to four of these GPUs can now be installed in a single workstation, giving customers more performance and room to scale. Each GPU can run inference independently and output its own response. The ROCm 6.1.3 software stack also adds beta support for Windows Subsystem for Linux (WSL 2).
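One way to picture each GPU serving requests independently is to pin one worker process per card and spread incoming requests across them. The sketch below is an illustration, not AMD's implementation: it assumes workers are isolated with ROCm's `HIP_VISIBLE_DEVICES` environment variable, and the helper names and round-robin policy are hypothetical. It only builds the environments and the assignment; launching actual inference processes is left out.

```python
import os
from itertools import cycle

MAX_GPUS = 4  # limit stated for ROCm 6.1.3 in the article


def worker_env(gpu_index: int, base_env=None) -> dict:
    """Environment for a worker process that should see a single GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["HIP_VISIBLE_DEVICES"] = str(gpu_index)  # restrict visible devices
    return env


def assign_requests(requests, num_gpus: int) -> dict:
    """Distribute requests round-robin over GPU indices 0..num_gpus-1."""
    if not 1 <= num_gpus <= MAX_GPUS:
        raise ValueError(f"num_gpus must be between 1 and {MAX_GPUS}")
    assignment = {gpu: [] for gpu in range(num_gpus)}
    for request, gpu in zip(requests, cycle(range(num_gpus))):
        assignment[gpu].append(request)
    return assignment


prompts = ["prompt-a", "prompt-b", "prompt-c", "prompt-d", "prompt-e"]
for gpu, batch in assign_requests(prompts, 4).items():
    print(gpu, worker_env(gpu, base_env={})["HIP_VISIBLE_DEVICES"], batch)
```

Each returned environment could be passed to `subprocess.Popen(..., env=...)` so that a worker only sees its own card; a production server would more likely balance on queue depth than use plain round-robin.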

More ML performance for your desktop

With today’s models easily exceeding the capabilities of standard hardware and software not designed for AI, ML engineers are looking for cost-effective solutions to develop and train their ML-powered applications. Thanks to large GPU memory sizes of 24GB or 48GB, a local PC or workstation equipped with the latest high-end AMD Radeon 7000 series GPU offers a powerful yet economical option to meet these expanding ML workflow challenges.

The latest high-end AMD Radeon 7000 series GPUs are built on the RDNA 3 GPU architecture:

featuring more than 2x higher AI performance per Compute Unit (CU) compared to the previous generation

now comes with up to 192 AI accelerators

offers up to 24GB or 48GB of GPU memory to handle large ML models

via AMD

With up to 48 GB of memory available per GPU, a four-card workstation offers up to 192 GB of combined memory capacity for AI workloads.
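The combined figure is simply the sum of the per-card capacities. The sketch below totals a few configurations using the 24 GB and 48 GB sizes mentioned in the article, and estimates how many cards a model of a given size would need if it could be split evenly across them. That even-split assumption, the 140 GB example size, and the function names are illustrative; the sum is an aggregate capacity, not a claim that the cards appear as one unified address space.

```python
import math

MAX_GPUS = 4  # limit stated for ROCm 6.1.3 in the article


def combined_memory_gb(per_card_gb) -> int:
    """Aggregate memory across the installed cards, in GB."""
    cards = list(per_card_gb)
    if len(cards) > MAX_GPUS:
        raise ValueError(f"at most {MAX_GPUS} GPUs are supported")
    return sum(cards)


def cards_needed(model_size_gb: float, card_memory_gb: float) -> int:
    """Minimum number of identical cards whose total memory covers the model,
    assuming the model can be sharded evenly (an idealization)."""
    return math.ceil(model_size_gb / card_memory_gb)


print(combined_memory_gb([48, 48, 48, 48]))  # four 48 GB cards
print(combined_memory_gb([24, 24, 24, 24]))  # four 24 GB cards
print(combined_memory_gb([48, 48, 24, 24]))  # mixed configuration
print(cards_needed(140, 48))                 # hypothetical 140 GB model
```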

While PyTorch and ONNX remain the go-to choices, ROCm 6.1.3 also adds qualification for the TensorFlow framework, giving users another option for AI development.