NVIDIA Talks How AI Will Transform Graphical Computation On Consumer GPUs: Transforming Pixels Into Visual Perception

NVIDIA has been at the forefront of bringing AI's prowess to consumers with its GPUs, and several upcoming technologies will usher in a new era of gaming and graphical fidelity in the years ahead.

The influx of artificial intelligence into nearly every mainstream application looks imminent, given that major tech giants such as Microsoft and Amazon are engaged in a race of AI integration, simply because of the massive advantages the technology brings, whether in the form of advanced computing or improved consumer experiences.

However, NVIDIA is at the very top when it comes to utilizing AI in computing and enterprise applications, a notable example being NVIDIA ACE. The firm now looks determined to harness AI to enhance graphical performance as well. This post isn't mainstream news but rather a narration of the chain of events showing how big a role AI will play in the graphical performance of our GPUs, ultimately opening new doorways in how developers turn their ideas into graphics.

In the modern era of graphics, the race lies in how well an architecture can upscale native resolutions, letting consumers view the best possible image through "artificially fueled" means such as NVIDIA's DLSS and AMD's FSR. NVIDIA says that with the influx of AI, it has managed to have AI generate 7 out of 8 pixels displayed to users, potentially "quadrupling" the resolution in some cases. Through this, the company has allowed developers to re-create older titles by fueling them with the power of DLSS.
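As a rough back-of-the-envelope reading of that claim (our arithmetic, not an official NVIDIA breakdown): if the upscaler renders only one in four pixels of a frame and frame generation produces every other frame entirely with AI, then just one in eight displayed pixels is rendered traditionally:

```latex
\underbrace{\tfrac{1}{4}}_{\text{4x upscaling}} \times \underbrace{\tfrac{1}{2}}_{\text{frame generation}}
= \tfrac{1}{8}\ \text{rendered natively}
\;\Longrightarrow\;
1 - \tfrac{1}{8} = \tfrac{7}{8}\ \text{of displayed pixels generated by AI}
```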

For a bit of background, NVIDIA's prior ray-tracing pipelines depended on several elements, from denoisers (used to clear out noise artifacts) to an anti-aliasing filter, to achieve the results we saw with technologies such as DLSS 2. While this did the job, it didn't let developers fully leverage the benefits of image upscaling since, in simple terms, the pipeline couldn't support it, although there's a whole lot of complexity encapsulated in that statement.

So, moving forward, integrating image upscaling more deeply was necessary, and to do this, NVIDIA introduced "Ray Reconstruction." This mechanism changed the way ray-tracing pipelines are handled by adopting a "unified denoiser" approach instead. Ray Reconstruction relies on AI trained to handle highly dynamic signals, such as moving shadows, light sources, and objects, and in return provides significantly better quality and performance than handcrafted denoisers.
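To picture what that change means in practice, here is a heavily simplified, runnable sketch (our own stand-in stubs in Python, not NVIDIA's pipeline code): in the classic setup each noisy ray-traced effect goes through its own handcrafted denoiser before upscaling, while the Ray Reconstruction approach feeds the raw noisy signals to one unified, AI-trained stage.

```python
import numpy as np

# Conceptual data-flow sketch only. A box blur stands in for a handcrafted denoiser and
# nearest-neighbour repetition stands in for the upscaler; the real components are far
# more sophisticated, this just illustrates where the unified AI stage slots in.

def box_blur(img: np.ndarray) -> np.ndarray:
    """Stand-in for a handcrafted, per-effect denoiser."""
    p = np.pad(img, 1, mode="edge")
    return sum(p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
               for dy in range(3) for dx in range(3)) / 9.0

def upscale_2x(img: np.ndarray) -> np.ndarray:
    """Stand-in for super resolution."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def classic_pipeline(shadows, reflections, gi):
    # One hand-tuned denoiser per effect, composited, then upscaled as a separate step.
    denoised = [box_blur(s) for s in (shadows, reflections, gi)]
    return upscale_2x(sum(denoised) / len(denoised))

def ray_reconstruction_style(shadows, reflections, gi):
    # A single unified stage sees the raw noisy signals together and produces the
    # denoised, upscaled result in one pass (trivial average + blur + upscale stand-in).
    return upscale_2x(box_blur(np.stack([shadows, reflections, gi]).mean(axis=0)))

rng = np.random.default_rng(0)
noisy = [rng.random((64, 64), dtype=np.float32) for _ in range(3)]
print(classic_pipeline(*noisy).shape, ray_reconstruction_style(*noisy).shape)  # (128, 128) twice
```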

Ray Reconstruction will prove to be a massive breakthrough in the domain of "AI-powered" graphics, given that it optimizes the ray-tracing process, making it accessible to every consumer out there and even removing hardware limitations in some cases. Not only that, but it has also pushed the boundaries of graphical computation, allowing developers to deliver stunning visuals and even recreate classic titles to ensure their transition into the modern era.

But what comes next? NVIDIA's John Burgess recently highlighted some of the emerging trends in AI graphics during the High-Performance Graphics event, talking specifically about consumer-level GPUs such as GeForce RTX and RTX workstation cards. He mentioned that AI can help with various rendering tasks beyond post-processing techniques such as DLSS. This is something that was also hinted at by Intel's TAP, who sees applications of AI beyond just upscaling and frame generation. An example of that was also shown by NVIDIA during the same event, which we will get to in a bit.

Some of the new approaches laid out by NVIDIA include:

  • Neural Texture Compression (our detailed post here)
  • Real-time Neural Appearance Models (our detailed post here)
  • NeuralVDB
  • Neural Radiance Cache (our detailed post here)

The first example of leveraging AI to enhance graphics fidelity is Neural Texture Compression, which uses a small MLP (Multi-Layer Perceptron), an artificial neural network made of connected neurons. The Neural Texture Compression model utilizes one MLP network per material texture stack and includes two hidden layers.

The model allows for a 4-16x compression uplift over standard block-compressed textures (BCs). This enables higher-resolution textures in the same memory footprint, or equivalent resolution in a much smaller footprint, allowing GPUs with limited VRAM, bandwidth, and cache to handle higher-resolution textures more effectively.
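To make that a bit more concrete, below is a minimal PyTorch sketch of the kind of network described above: a tiny MLP with two hidden layers that decodes a material texture stack from a learned low-resolution latent grid. The layer sizes, the latent grid, and the channel layout are our own illustrative assumptions, not NVIDIA's published implementation.

```python
import torch
import torch.nn as nn

class NeuralTextureStack(nn.Module):
    """Toy stand-in for Neural Texture Compression: one small MLP per material texture
    stack, decoding e.g. albedo + normal + roughness (7 channels here) from a learned
    low-resolution latent grid. Sizes and layout are illustrative assumptions."""

    def __init__(self, latent_res: int = 64, latent_dim: int = 8,
                 hidden: int = 64, out_channels: int = 7):
        super().__init__()
        # Compressed representation: a small latent grid instead of full-res BC textures.
        self.latents = nn.Parameter(torch.randn(1, latent_dim, latent_res, latent_res) * 0.01)
        # The decoder: an MLP with 2 hidden layers, as described in the talk.
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim + 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_channels),
        )

    def forward(self, uv: torch.Tensor) -> torch.Tensor:
        # uv: (N, 2) texture coordinates in [0, 1]; sample the latent grid bilinearly.
        grid = uv.view(1, -1, 1, 2) * 2.0 - 1.0                   # to [-1, 1] for grid_sample
        feats = nn.functional.grid_sample(self.latents, grid, align_corners=True)
        feats = feats.view(self.latents.shape[1], -1).t()          # (N, latent_dim)
        return self.mlp(torch.cat([feats, uv], dim=-1))            # (N, out_channels)

# Usage: train against the original full-resolution texture stack, then ship only the
# latent grid + MLP weights, which can be far smaller than the BC-compressed textures.
model = NeuralTextureStack()
uv = torch.rand(4096, 2)
decoded = model(uv)   # (4096, 7) texel values reconstructed on the fly
```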

Next is NeuralVDB which represents compressed volume data as well as sparse tree topology. It utilizes 2-4 MLPs per volume with 3-4 hidden layers and achieves a 10-100x compression ratio. At SIGGRAPH 2022, NVIDIA showcased how the model can be utilized to run complex volumetric simulations with up to a 100x memory footprint reduction.
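The same trick, applied to volumes, looks roughly like the coordinate network below (a toy single-MLP sketch of the idea; real NeuralVDB also encodes the sparse VDB tree topology and uses two to four MLPs per volume, so take this purely as an illustration):

```python
import torch
import torch.nn as nn

# Toy sketch of the NeuralVDB idea: replace a dense voxel grid with a small coordinate
# network that returns density at any (x, y, z). A single MLP with 3 hidden layers is
# used here purely for illustration.
density_net = nn.Sequential(
    nn.Linear(3, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Softplus(),   # densities are non-negative
)

# Query: instead of storing a huge dense grid, you store roughly 9k network weights and
# evaluate on demand. The gap between a dense grid and a small weight blob is where this
# class of memory savings comes from; real ratios depend on sparsity and quality targets.
points = torch.rand(8192, 3)           # sample positions inside the volume
densities = density_net(points)        # (8192, 1) reconstructed densities
```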

Lastly, there's the Neural Radiance Cache, which uses a neural network to encode radiance information. The model uses a single MLP network per probe with two hidden layers and can dynamically update (training) and query (inference) probes. Adding the Neural Radiance Cache to a path-traced render improves sample quality dramatically, which helps the image converge faster or makes denoising the render easier.
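Here is a minimal sketch of what "update (training) and query (inference)" could look like inside a frame loop, assuming a single small MLP that maps a shading point's position and direction to cached radiance; the layer sizes, the loss, and the placeholder path tracer are our own assumptions for illustration.

```python
import torch
import torch.nn as nn

# Toy radiance cache: an MLP with 2 hidden layers mapping (position, direction) to RGB
# radiance. In a real renderer this runs inside the frame loop: a few freshly path-traced
# samples provide training targets, then later rays query the cache instead of tracing
# long paths. Everything below is an illustrative assumption, not NVIDIA's code.
cache = nn.Sequential(
    nn.Linear(6, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 3),
)
optimizer = torch.optim.Adam(cache.parameters(), lr=1e-3)

def path_trace(positions, directions):
    # Placeholder for the renderer's ground-truth path-traced radiance.
    return torch.sigmoid(positions.sum(-1, keepdim=True)).repeat(1, 3)

for frame in range(3):                                 # per-frame loop
    pos, dirs = torch.rand(1024, 3), torch.rand(1024, 3)
    # 1) Update: train on a small batch of freshly traced samples.
    target = path_trace(pos, dirs)
    loss = nn.functional.mse_loss(cache(torch.cat([pos, dirs], -1)), target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    # 2) Query: subsequent rays read radiance from the cache instead of tracing further.
    with torch.no_grad():
        cached_radiance = cache(torch.cat([pos, dirs], -1))
```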

Now that we are done with ray tracing, let's talk a bit about shading performance and real-time rendering, which are the basic building blocks of graphical computation. NVIDIA has come up with a pretty interesting way to tackle this segment: as highlighted in previous coverage, the firm has leveraged neural material models, which we'll explain below. In short, Team Green has managed to apply neural networks in a way that pushes graphical computation to levels that were never imagined before.

Back at the SIGGRAPH 2024 keynote, NVIDIA introduced Neural Appearance Models, or NAMs, which utilize AI to represent and render the appearance of materials far more realistically, through methods that are much more optimized than traditional ones. These neural models are trained to learn the visual characteristics of real-world materials, and by applying that knowledge during rendering, they produce an end product that is not only highly realistic but also much faster to compute.

While explaining the intricacies of neural networks is challenging, especially for the average reader, we'll try to sum things up as effectively as possible. The Neural Appearance Models we talked about earlier are built on two MLPs (Multi-Layer Perceptrons), one for BRDF evaluation and one for importance sampling, scaled up to a whole new level. Along with that, NAMs use an encoder-decoder architecture, which processes input data and generates the final material appearance from the material's parameters and the training dataset.
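As a rough sketch of that two-MLP split (our own simplified layout in PyTorch; the real models involve learned latent texture codes and considerably more machinery):

```python
import torch
import torch.nn as nn

# Simplified sketch of the two-MLP split behind Neural Appearance Models: one network
# evaluates the BRDF (how much light the material reflects for a given light/view pair),
# the other drives importance sampling (which directions are worth sampling). Latent
# codes, layer sizes, and inputs are illustrative assumptions, not the published model.

LATENT = 16   # per-texel latent material code produced by the (omitted) encoder

brdf_eval = nn.Sequential(            # latent + view dir + light dir -> RGB reflectance
    nn.Linear(LATENT + 6, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 3),
)
importance_sampler = nn.Sequential(   # latent + view dir -> parameters of a sampling lobe
    nn.Linear(LATENT + 3, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 4),                 # e.g. a lobe direction (3) plus sharpness (1)
)

latent = torch.randn(1024, LATENT)    # decoded from the material's latent texture
view, light = torch.rand(1024, 3), torch.rand(1024, 3)
reflectance = brdf_eval(torch.cat([latent, view, light], -1))     # (1024, 3)
lobe_params = importance_sampler(torch.cat([latent, view], -1))   # (1024, 4)
```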

Now that you have some idea of how NAMs work, let's move on to the interesting part: their potential capabilities. NVIDIA's showcase of Neural Appearance Models revealed that they are capable of rendering materials at up to 16K texture resolution, which is a massive leap. Apart from this, the computationally efficient neural networks on board are said to cut material rendering times by a whopping 12-24x, yet another gigantic achievement, given that this was deemed impossible with the traditional shading-graph technique.

Looking forward, NVIDIA believes that MLPs and AI in general can achieve some big improvements in the world of graphics:

Simple MLPs are surprisingly powerful for:

  • Data compression
  • Approximation of complex math
  • Caching of complex signal data

Their performance can be competitive with traditional rendering thanks to:

  • Layer fusion
  • Leveraging reduced precision and sparsity (see the sketch after this list)

Since MLPs are small, their performance can be competitive with traditional rendering, so these enhancements won't come at a huge cost.
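As for the reduced-precision point, here is a minimal sketch of evaluating such a tiny MLP in half precision; true layer fusion happens at the kernel level (for example, NVIDIA's open-source tiny-cuda-nn library implements fully fused MLPs), which plain framework code like this cannot express.

```python
import torch
import torch.nn as nn

# Why tiny MLPs can keep up with traditional shading: the whole network is small enough
# that its layers can be fused into a single kernel (activations never leave on-chip
# memory) and evaluated in reduced precision on tensor cores. This sketch only shows the
# reduced-precision side; sizes and the batch of shading points are illustrative.

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.bfloat16   # reduced precision either way

mlp = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 3),
).to(device=device, dtype=dtype)

x = torch.rand(1 << 20, 16, device=device, dtype=dtype)   # ~1M shading points per batch
with torch.no_grad():
    y = mlp(x)   # a few small, tensor-core-friendly matrix multiplies per shading point

# Only ~5k weights total, so the per-point cost is comparable to a moderately complex
# handwritten shader rather than a heavyweight neural network.
```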

Challenges:

  • Divergence
  • Intermixing with the traditional shader core

Divergence essentially means that if each thread on the GPU queries and runs its own neural network for its texel value, then threads that are meant to work together end up diverging in both the code they execute (execution divergence) and the data they fetch (data divergence), and that has to be overcome. A toy illustration of one common mitigation is sketched below.
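One common way to tame that divergence is to batch the queries so that work hitting the same network runs together; the sketch below shows the idea in PyTorch terms (a generic mitigation sketch of ours, not necessarily how NVIDIA handles it on the GPU):

```python
import torch
import torch.nn as nn

# Toy illustration of reducing divergence by coherence batching: if every pixel naively
# ran its own material's MLP in place, neighbouring threads would execute different
# networks (execution divergence) and fetch different weights (data divergence).
# Grouping the queries by material lets each network run as one dense batch instead;
# a real GPU implementation would also sort/compact so warps stay coherent.

num_pixels, num_materials = 1 << 16, 8
material_id = torch.randint(num_materials, (num_pixels,))   # which MLP each pixel needs
uv = torch.rand(num_pixels, 2)                              # that pixel's query input

# Hypothetical per-material decoders (stand-ins for the per-stack texture MLPs above).
decoders = [nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 7))
            for _ in range(num_materials)]

output = torch.empty(num_pixels, 7)
with torch.no_grad():
    for m in range(num_materials):
        idx = (material_id == m).nonzero(as_tuple=True)[0]  # all pixels needing network m
        if idx.numel():
            output[idx] = decoders[m](uv[idx])              # one coherent batch per network
```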

An example of future rendering with AI was showcased in the form of a recent OpenAI video created with Sora, which shows a jeep driving across rugged dirt terrain, leaving a realistic dust trail behind it and exhibiting realistic weight and physical behavior based on the terrain. It is an entirely AI-generated video that gives us a glimpse of future applications of AI, such as games. The model behind this small animation took tens of thousands of GPUs to train, yet it produced the clip from a short text prompt; as AI hardware becomes more powerful, we could see this kind of capability coming to consumer GPUs in the years ahead.

Other interesting takeaways from the session were the comments on using dedicated hardware such as an NPU versus the GPU, and on other future accelerators that might be integrated into upcoming GPUs:

The problem I see with dedicated hardware that isn't tied to a GPU is that you lose the entire ecosystem that you can use to do all the things around the neural networks you are using, and that's why we prefer having it pretty tightly coupled, so that you can do programmable code on an SM, then go to the tensor core and go back as often as you need to for your particular problem. So somebody could go out and build their own dedicated hardware that just ran convolutional encoders, but how can you do all the other stuff that you need to do?

I don't want to talk about any future product or anything, but until I saw the neural appearance models start to form and work, I spent a lot of time thinking about how to accelerate materials: like, if there was an amazing uber material, you know, an Arnold shader that could take some complex thing with 50 inputs, and you build some dedicated hardware that just blasts through it, or samples multi-layer with Monte Carlo or something, you know, position-free Monte Carlo, and it led basically nowhere. I think there's not an obvious next disruption that's going to come from dedicated hardware that I can see. I think the next disruption is going to be about using the tools we've just enabled to take that next step without needing hardware, because the hardware is already accelerating the building blocks from which we could build something new and exciting. And I think the neural appearance models are a good example of that; we already built the hardware, we just didn't know that it was good for that until we tried it.

John Burgess - NVIDIA

It's safe to say that AI's influence over the world of computation can't be fully measured yet, given that the technology has opened up endless possibilities, and its impact on graphical computation is just one example of its potential. With Neural Appearance Models and Ray Reconstruction, next-gen graphics are surely going to reach the point we once only dreamt of, thanks to the efforts of NVIDIA and its team, not to mention the role raw hardware power plays here.
