AMD Instinct MI300X Makes First Appearance In MLPerf v4.1 AI Benchmarks, Tested With Next-Gen EPYC Turin "Zen 5" CPUs

AMD's Instinct MI300X AI accelerators have made their first appearance in MLPerf v4.1, tested alongside the next-gen EPYC "Turin" CPUs.
Today, AMD is sharing the first performance benchmarks of its latest data center and AI-centric hardware in MLPerf Inference v4.1, a standardized suite of workloads designed to showcase the potential of the latest and upcoming hardware from tech giants such as AMD, Intel & NVIDIA.
The red team is sharing its first MLPerf submissions for the Instinct MI300X accelerator since the chip was introduced, while also giving us a taste of the upcoming EPYC "Turin" CPUs, the 5th Gen server lineup based on the Zen 5 core architecture.
For the performance evaluation, AMD submitted results for its Instinct MI300X AI accelerators running on a Supermicro AS-8125GS-TNMR2 system. Four results were submitted to MLPerf v4.1: two under the Offline scenario and two under the Server scenario. Two of these tests paired the accelerators with 4th Gen EPYC "Genoa" CPUs, while the other two used the upcoming 5th Gen EPYC "Turin" CPUs.
In addition to AMD's own submissions, Dell validated platform-level performance of AMD Instinct accelerators by submitting Llama 2 70B results on an 8x AMD Instinct MI300X setup in its PowerEdge XE9680 server (Submission ID 4.1-0022: 8x AMD Instinct MI300X accelerators with 2x Intel Xeon Platinum 8460Y+, Available category).
Looking at the Llama 2 70B results, AMD achieved 21,028 tokens/s in the server scenario and 23,514 tokens/s in the offline scenario with the EPYC "Genoa" CPUs, while the 5th Gen EPYC "Turin" CPUs with the same Instinct configuration delivered 22,021 tokens/s in the server and 24,110 tokens/s in the offline scenario. That works out to a 4.7% (server) and 2.5% (offline) improvement over the Genoa platform.
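Those uplift figures are straightforward arithmetic on the published throughput numbers; here is a minimal Python sketch reproducing them, using only the tokens/s values quoted above:

```python
# Published MLPerf v4.1 Llama 2 70B throughput (tokens/s) for the
# 8x Instinct MI300X system, paired with Genoa vs. Turin CPUs.
results = {
    "server":  {"genoa": 21_028, "turin": 22_021},
    "offline": {"genoa": 23_514, "turin": 24_110},
}

for scenario, r in results.items():
    uplift = (r["turin"] / r["genoa"] - 1) * 100
    print(f"{scenario}: {uplift:.1f}% uplift moving from Genoa to Turin")

# server: 4.7% uplift moving from Genoa to Turin
# offline: 2.5% uplift moving from Genoa to Turin
```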
Compared to the NVIDIA H100, the Instinct MI300X is slightly slower in the server scenario, with the gap widening in the offline scenario. The Turin configuration does end up about 2% faster in the server scenario but still lags in the offline one. These results broadly match the figures NVIDIA published in its own announcement. AMD has also showcased near-perfect scaling in Llama 2 70B by comparing 1-GPU and 8-GPU results.
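"Near-perfect scaling" is typically quantified as the multi-GPU result divided by the single-GPU result times the GPU count. The sketch below shows that calculation with a hypothetical single-GPU figure, since the article does not reproduce AMD's published 1x MI300X number; treat it as illustrative only:

```python
# Hypothetical 1x MI300X throughput, for illustration only; AMD's actual
# single-GPU figure comes from its MLPerf v4.1 submission.
one_gpu = 3_000      # tokens/s, placeholder value
eight_gpu = 23_514   # tokens/s, published 8x MI300X offline result
num_gpus = 8

# Efficiency relative to perfectly linear scaling across all eight GPUs.
efficiency = eight_gpu / (num_gpus * one_gpu) * 100
print(f"Scaling efficiency: {efficiency:.1f}% of linear")  # ~98.0% here
```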
Lastly, AMD highlights the memory advantage of its Instinct MI300X AI accelerators: 192 GB of HBM3 per GPU, far exceeding the 80 GB offered on the NVIDIA H100 platform. That capacity is enough to meet the requirements of the largest language models across a variety of data formats.
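As a back-of-the-envelope check on that claim, model weights alone need roughly parameter count × bytes per parameter. The capacities below are the well-known specs (192 GB per MI300X, 80 GB per H100 SXM); note this weights-only estimate ignores KV cache and activation overhead, which push real requirements higher:

```python
# Rough weights-only memory footprint for a 70B-parameter model
# across data formats (KV cache and activations not included).
PARAMS = 70e9
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

MI300X_GB = 192  # HBM3 per Instinct MI300X
H100_GB = 80     # HBM3 per H100 (SXM)

for fmt, nbytes in BYTES_PER_PARAM.items():
    weights_gb = PARAMS * nbytes / 1e9
    print(f"{fmt}: ~{weights_gb:.0f} GB of weights | "
          f"fits one MI300X: {weights_gb <= MI300X_GB} | "
          f"fits one H100: {weights_gb <= H100_GB}")

# fp16: ~140 GB of weights | fits one MI300X: True | fits one H100: False
# fp8: ~70 GB of weights | fits one MI300X: True | fits one H100: True
```

In FP16, a 70B model's weights fit on a single MI300X but not on a single H100, which is the crux of AMD's memory argument.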
We're excited to continue showcasing the versatility and performance of AMD Instinct accelerators across future benchmarks as we expand our testing and optimization efforts. This is just the beginning of our journey. In the coming months, we plan to launch the next iterations of the AMD Instinct series, featuring, among other advances, additional memory, support for lower-precision data types, and increased compute power. Future ROCm releases are targeted to bring software enhancements, including kernel improvements and advanced quantization support. Stay tuned for our next MLPerf submission; we look forward to sharing our progress and insights with you.
via AMD
AMD isn't done here: it aims to solidify its ROCm stack with more AI-focused optimizations, so we can expect performance updates in the next round of MLPerf submissions. While AMD took a good amount of time to submit MI300X numbers, we can hope that the MI325X, which debuts next quarter, sees its results submitted much sooner, as it's a major product that delivers a 50% memory capacity increase over the MI300X. AMD's EPYC Turin "Zen 5" CPUs are also expected to launch later this year, so stay tuned.