Intel-Powered Aurora Becomes The Fastest Supercomputer For AI, Finally Breaks The Exascale Barrier

The Aurora supercomputer has finally broken the exascale barrier and achieved the fastest AI performance with its Intel Ponte Vecchio hardware.

Deployed at Argonne National Laboratory and built in collaboration with HPE (Hewlett Packard Enterprise), the Aurora supercomputer was expected to be one of the top performers in the HPC and AI segment. Powered by Intel's Xeon CPU Max and Data Center GPU Max series, the platform was in a race against AMD, which managed to be the first to break the exascale barrier. Although it was announced back in 2019, Aurora had struggled to reach its targets until now. Today, the system is running at 87% of its operational capacity, or 9,234 nodes in total.

At ISC High Performance 2024, Intel announced in collaboration with Argonne National Laboratory and Hewlett Packard Enterprise (HPE), that the Aurora supercomputer has broken the exascale barrier at 1.012 exaflops and is the fastest AI system in the world dedicated to AI for open science, achieving 10.6 AI exaflops. Intel will also detail the crucial role of open ecosystems in driving AI-accelerated high-performance computing (HPC).

via Intel

In terms of specs, the Aurora supercomputer is built using 166 racks that house 10,624 blades, 21,248 Intel Xeon CPU Max chips (4th Gen Sapphire Rapids), and 63,744 Intel Data Center GPU Max series units (Ponte Vecchio). It uses the HPE Slingshot fabric for its interconnect, with 84,992 endpoints.
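
As a quick sanity check on those figures, the totals divide evenly by the blade count. The short Python sketch below uses only the numbers quoted above. The per-blade and per-rack breakdown is derived arithmetic rather than something stated in the announcement, and the variable names are purely illustrative.

```python
# Totals as quoted in the article.
RACKS = 166
BLADES = 10_624
CPUS = 21_248        # Intel Xeon CPU Max chips
GPUS = 63_744        # Intel Data Center GPU Max units
ENDPOINTS = 84_992   # Slingshot fabric endpoints

# Derived ratios: plain division, assuming every blade and rack
# is populated identically (an assumption, not a stated fact).
blades_per_rack = BLADES / RACKS
cpus_per_blade = CPUS / BLADES
gpus_per_blade = GPUS / BLADES
endpoints_per_blade = ENDPOINTS / BLADES

print(f"Blades per rack:     {blades_per_rack:g}")
print(f"CPUs per blade:      {cpus_per_blade:g}")
print(f"GPUs per blade:      {gpus_per_blade:g}")
print(f"Endpoints per blade: {endpoints_per_blade:g}")
```

Under that assumption, each blade works out to two CPUs, six GPUs, and eight fabric endpoints, with 64 blades per rack.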

In terms of performance metrics, the Aurora supercomputer ranked second in the HPL (High-Performance LINPACK) benchmark but still broke the exascale barrier at 1.012 exaflops while using just 87% of its total node capacity (9,234 of 10,624 nodes). The system also ranked third in the HPCG test with 5,612 TFLOPS while using just 39% of the system.
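
To put the partial-system result in perspective, the sketch below works through the arithmetic behind the 87% figure and derives a rough per-node throughput. The full-system projection at the end assumes perfectly linear scaling, which is our simplification and not a claim made by Intel or Argonne. Real HPL runs rarely scale that cleanly.

```python
# Figures as quoted in the article.
HPL_EXAFLOPS = 1.012
NODES_USED = 9_234
NODES_TOTAL = 10_624

# Share of the machine that took part in the HPL run.
utilization = NODES_USED / NODES_TOTAL

# Average HPL throughput per participating node (1 exaflop = 1,000,000 teraflops).
tflops_per_node = HPL_EXAFLOPS * 1_000_000 / NODES_USED

# ASSUMPTION: perfectly linear scaling to the remaining nodes.
# Treat this as a rough estimate, not a forecast.
projected_full_exaflops = HPL_EXAFLOPS / utilization

print(f"Node utilization:        {utilization:.1%}")
print(f"HPL per node:            {tflops_per_node:.1f} TFLOPS")
print(f"Naive full-system score: {projected_full_exaflops:.2f} exaflops")
```

The division confirms the roughly 87% utilization quoted above, puts the average at about 110 teraflops per node, and suggests something in the region of 1.16 exaflops if the remaining nodes contributed at the same rate.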

Utilizing the Xe core architecture and its several AI hardware blocks, the Aurora supercomputer now ranks first in the AI performance charts with a total rated performance of 10.6 AI exaflops. The performance was measured using the mixed-precision LINPACK (HPL-MxP) benchmark.
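
The gap between the two headline numbers comes down to precision: HPL is a double-precision (FP64) benchmark, while HPL-MxP allows lower-precision arithmetic combined with iterative refinement. A minimal sketch of the ratio follows. Note that the article does not state how many nodes the HPL-MxP run used, so this compares the two published scores rather than two runs on identical hardware.

```python
# Scores as quoted in the article.
HPL_EXAFLOPS = 1.012       # double-precision HPL
HPL_MXP_EXAFLOPS = 10.6    # mixed-precision HPL-MxP ("AI exaflops")

# Ratio of the two headline figures. The node count of the HPL-MxP run
# is not given in the article, so this is indicative only.
mxp_to_hpl_ratio = HPL_MXP_EXAFLOPS / HPL_EXAFLOPS

print(f"HPL-MxP vs HPL: {mxp_to_hpl_ratio:.1f}x")
```

In other words, the mixed-precision score is a little over ten times the double-precision one, which is why the "AI exaflops" figure should not be compared directly with traditional exascale claims.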

What’s Next: New supercomputers being deployed with Intel Xeon CPU Max Series and Intel Data Center GPU Max Series technologies underscore Intel’s goal to advance HPC and AI. Systems include Euro-Mediterranean Centre on Climate Change’s (CMCC) Cassandra to accelerate climate change modeling; Italian National Agency for New Technologies, Energy and Sustainable Economic Development's (ENEA) CRESCO 8 to enable breakthroughs in fusion energy; Texas Advanced Computing Center (TACC), which is in full production to enable data analysis in biology to supersonic turbulence flows and atomistic simulations on a wide range of materials; as well as United Kingdom Atomic Energy Authority (UKAEA) to solve memory-bound problems that underpin the design of future fusion powerplants.

The result from the mixed-precision AI benchmark will be foundational for Intel’s next-generation GPU for AI and HPC, code-named Falcon Shores. Falcon Shores will leverage the next-generation Intel Xe architecture with the best of Intel Gaudi. This integration enables a unified programming interface.

Early performance results on Intel Xeon 6 with P-cores and Multiplexer Combined Ranks (MCR) memory at 8800 MT/s deliver up to 2.3x performance improvement for real-world HPC applications, like Nucleus for European Modeling of the Ocean (NEMO), when compared to the previous generation, setting a strong foundation as the preferred host CPU choice for HPC solutions.