NVIDIA Hopper H100 AI GPU Benchmarked – Slower Than AMD 680M iGPU In Gaming But Destroys RTX 4090 In AI Tests

The NVIDIA Hopper H100 is currently the fastest GPU on the planet for HPC & AI workloads, making it the most popular chip on the market right now.

Since the AI explosion, demand for the Hopper H100 GPU has surged, and the company has poured its resources into ramping up production of the chip to meet it. Chinese content creator Geekerwan brings a first look at the performance of the chip running in a standard PC in up to a 4-way configuration, tested across multiple creation apps as well as gaming benchmarks.

The NVIDIA H100 is a very expensive chip to get hold of within China. We have seen units cost around $30,000 and up to $50,000 US, so four of these graphics cards would cost well over $100,000, which is insane. To test the GPUs in a DIY PC setup, a 3D-printed duct had to be made to deliver airflow to the card, since it ships with a passive heatsink and has no active cooling solution onboard. An NVIDIA GeForce GTX 1650 Ti was also used as a secondary display card since the H100 doesn't offer any display outputs and is intended to be used purely as an accelerator.

The variant of the H100 used was the 80 GB PCIe model, featuring 114 SMs enabled out of the full 144 SMs of the GH100 GPU (the H100 SXM enables 132 SMs). As configured, the chip offers 3200 TFLOPs of FP8, 1600 TFLOPs of FP16, 800 TFLOPs of TF32, and 48 TFLOPs of FP64 compute horsepower. It also features 456 Tensor Cores and texture units along with 24 ROPs.

The H100 is structured so that only two of its TPCs are capable of standard graphics processing, while the rest of the GPU is dedicated to compute tasks, which can lead to poor results in gaming, and the drivers don't officially support such workloads either. The card features 80 GB of HBM2e (2.0 Gbps) memory across a 5120-bit bus interface and has a rated TDP of 350W.

Starting with the benchmarks, the card was first tested in Stable Diffusion, and while the H100 was able to generate an image in 2.82 seconds, it was still slower than the RTX 4090. The main issue was xformers, which didn't include support for the H100, so a different model, Donut, was used instead. Donut made use of the Transformer Engine found on Hopper H100 GPUs under PyTorch 2.0 with CUDA 11.8 support enabled.
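For readers who want to run a similar test, below is a minimal sketch of timing Stable Diffusion image generation with Hugging Face's diffusers library. The checkpoint name, prompt, and step count are our own illustrative assumptions, not Geekerwan's exact setup, and PyTorch 2.0's built-in scaled-dot-product attention stands in for xformers here.

```python
# Minimal sketch: timing Stable Diffusion image generation on a single GPU.
# Assumptions (not from the original test): checkpoint, 50 steps, FP16 weights.
import time

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # hypothetical choice of checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
# No pipe.enable_xformers_memory_efficient_attention() call here; on PyTorch 2.0
# the pipeline falls back to torch's native scaled-dot-product attention.

prompt = "a photo of an astronaut riding a horse"
pipe(prompt, num_inference_steps=5)  # warm-up run to exclude one-time setup costs

torch.cuda.synchronize()
start = time.time()
image = pipe(prompt, num_inference_steps=50).images[0]
torch.cuda.synchronize()
print(f"Image generated in {time.time() - start:.2f} s")
image.save("output.png")
```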

The performance here was the complete opposite, with the H100 delivering 30% faster speeds than the RTX 4090 and RTX 6000 Ada. The content creator also ran up to four H100 GPUs to check scaling: two H100s offered a further 43% boost, but the 3-way and 4-way setups showed diminishing returns and even negative scaling. It looks like standard PCs simply can't take advantage of multiple H100 GPUs.
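A simple way to check this kind of scaling yourself is to measure the throughput of the same workload as GPUs are added. The sketch below uses PyTorch's DataParallel on a dummy model purely as an illustration; it is not the workload Geekerwan ran.

```python
# Sketch: measuring how throughput scales as GPUs are added (illustrative only).
import time

import torch
import torch.nn as nn

def benchmark(num_gpus: int, batch: int = 256, steps: int = 50) -> float:
    model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))
    model = nn.DataParallel(model.cuda(), device_ids=list(range(num_gpus)))
    x = torch.randn(batch * num_gpus, 4096, device="cuda")

    for _ in range(5):  # warm-up
        model(x)
    torch.cuda.synchronize()

    start = time.time()
    for _ in range(steps):
        model(x)
    torch.cuda.synchronize()
    return steps * x.shape[0] / (time.time() - start)  # samples per second

for n in range(1, torch.cuda.device_count() + 1):
    print(f"{n} GPU(s): {benchmark(n):,.0f} samples/s")
```

On a consumer platform without NVLink or NVSwitch, all inter-GPU traffic has to travel over PCIe, which is one plausible reason multi-GPU scaling falls apart in a standard PC.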

Within the VITS training benchmark, the H100 delivered 23% faster performance than the NVIDIA RTX 4090 and RTX 6000 Ada. This is a very memory-intensive benchmark; increasing the batch size doesn't hurt the H100 since it already packs 80 GB of memory, but the RTX 4090 did lag behind at higher batch sizes due to its limited 24 GB of VRAM.
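The batch-size behavior is easy to see in practice: on a smaller-VRAM card, training simply runs out of memory once a batch no longer fits. Below is a rough, generic sketch of probing the largest batch size a GPU can handle; the toy model stands in for VITS and is an assumption on our part.

```python
# Rough sketch: probing the largest batch size a GPU can train before OOM.
# The toy model below is a stand-in; it is not the actual VITS network.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8192, 8192), nn.ReLU(), nn.Linear(8192, 8192)).cuda()
opt = torch.optim.Adam(model.parameters())

batch = 8
while True:
    try:
        x = torch.randn(batch, 8192, device="cuda")
        loss = model(x).square().mean()
        loss.backward()
        opt.step()
        opt.zero_grad(set_to_none=True)
        print(f"batch size {batch} fits")
        batch *= 2
    except torch.cuda.OutOfMemoryError:
        print(f"batch size {batch} no longer fits in VRAM")
        break
    finally:
        torch.cuda.empty_cache()
```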

Next up is a large ChatGPT-style LLaMA language model with a total of 65 billion parameters, which was manageable on the H100, whereas the RTX 4090 could only run models of up to 6 billion parameters. This shows that for LLMs at least, gaming GPUs are not a wise option and it's better to get a dedicated accelerator. In the last set of benchmarks, HPC workloads such as LAMMPS (28 Mar 2023 build) were used, and neither the RTX 4090 nor the RTX 6000 Ada could stand up against the H100 PCIe, which obliterated both offerings.
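The VRAM gap explains most of that result: just storing the weights of a 65-billion-parameter model in FP16 takes well over 100 GB, far beyond a 24 GB card, while a ~6-billion-parameter model needs only around 11 GB before activations and KV cache. A quick back-of-the-envelope check (our own arithmetic, not from the video):

```python
# Back-of-the-envelope VRAM needed just for model weights (excludes activations,
# KV cache, and framework overhead).
def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """FP16/BF16 = 2 bytes per parameter, INT8 = 1, FP32 = 4."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

for params in (6, 65):
    print(f"{params}B params in FP16: ~{weight_memory_gb(params):.0f} GB of weights")
# 6B params in FP16: ~11 GB of weights   -> fits a 24 GB RTX 4090
# 65B params in FP16: ~121 GB of weights -> needs 80 GB H100s (several, or lower precision)
```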

But how does the card perform in games? 3DMark Time Spy and Red Dead Redemption 2 were used to test the gaming performance of the NVIDIA H100, and the card ran slower than AMD's Radeon 680M, which is an integrated GPU. The problem comes down to under-utilization and unoptimized drivers, which should be expected since the H100 is an HPC/AI-first solution and NVIDIA provides no official gaming drivers for the card.

In Red Dead Redemption 2, the card was run at 1080p with High settings and the DLSS "Balanced" preset and still delivered sub-30 FPS. Once again, the card's power draw stayed under 100W, a sign of major under-utilization of the H100 GPU.
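Power draw and utilization like this can be verified from software while a game is running. The snippet below polls both through NVIDIA's NVML bindings (the pynvml package); this is simply our suggested way of checking, not the tooling used in the video.

```python
# Sketch: polling GPU power draw and utilization via NVML (pip install nvidia-ml-py).
import time

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # assumes the H100 is GPU 0

for _ in range(10):
    power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000  # NVML reports milliwatts
    util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu  # percent
    print(f"power: {power_w:6.1f} W | GPU utilization: {util:3d} %")
    time.sleep(1)

pynvml.nvmlShutdown()
```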

So NVIDIA's H100 is what it's said to be: a great card for AI and HPC workloads, and that's about it. It is a very expensive accelerator, but since there's no competition out there to match it, the green team can get away with these prices until AMD and Intel offer more competitive solutions in the same space.

News Source: I_Leak_VN
