Intel Makes Its NPU Acceleration Library An Open-Source Asset, Allowing Devs To Optimize AI Applications
Intel has finally open-sourced its NPU Acceleration Library, allowing developers and enthusiasts to tune their applications to run best on Intel's AI engines.

The news comes from Intel Tech Evangelist Tony Mongkolsmai, who announced the newly open-sourced library on X.

With this step, the NPU Acceleration Library will help developers benefit from the NPUs built into CPU lineups such as the Meteor Lake "Core Ultra" series. The library is based on Python and simplifies development by providing a high-level interface, and it supports popular frameworks such as TensorFlow and PyTorch, giving developers the means to make AI-related tasks more efficient.
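To give a sense of that high-level interface, here is a minimal sketch based on the quickstart in the library's GitHub README. The `intel_npu_acceleration_library.compile` call is the documented entry point; the small model and tensor shapes below are purely illustrative.

```python
import torch
from torch import nn

import intel_npu_acceleration_library


class TinyClassifier(nn.Module):
    """A toy network standing in for a real workload."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    def forward(self, x):
        return self.net(x)


model = TinyClassifier().eval()

# A single call offloads the supported layers to the Intel NPU.
npu_model = intel_npu_acceleration_library.compile(model)

with torch.no_grad():
    logits = npu_model(torch.randn(1, 128))
print(logits.shape)  # torch.Size([1, 10])
```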

For devs that have been asking, check out the newly open sourced Intel NPU Acceleration library. I just tried it out on my MSI Prestige 16 AI Evo machine (windows this time, but the library supports Linux as well) and following the GitHub documentation was able to run TinyLlama… pic.twitter.com/UPMujuKGGT

— Tony Mongkolsmai (@tonymongkolsmai) March 1, 2024
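For reference, the TinyLlama run Tony mentions follows the example in the library's GitHub documentation. The condensed sketch below is adapted from that README example, so treat it as an approximation: the model ID is TinyLlama's public Hugging Face checkpoint, and the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

import intel_npu_acceleration_library

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

model = AutoModelForCausalLM.from_pretrained(model_id, use_cache=True).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token_id = tokenizer.eos_token_id
streamer = TextStreamer(tokenizer, skip_special_tokens=True)

# Offload the model's supported layers to the NPU, quantizing weights to int8.
model = intel_npu_acceleration_library.compile(model, dtype=torch.int8)

query = "What is an NPU?"
inputs = tokenizer(query, return_tensors="pt")
_ = model.generate(**inputs, max_new_tokens=64, streamer=streamer)
```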

Tony ran the NPU Acceleration Library on an MSI Prestige 16 AI Evo laptop, which features an Intel Core Ultra CPU. He was able to run the TinyLlama and Gemma-2b-it LLM models on the machine without performance disruptions, which indicates the potential packed into Intel's NPUs and how they promote an edge-AI environment for developers. Here is how the Intel development team itself describes the library:

The Intel NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.

In our quest to significantly improve the library's performance, we are directing our efforts toward implementing a range of key features, including:

  • 8-bit quantization
  • 4-bit Quantization and GPTQ
  • NPU-Native mixed precision inference
  • Float16 support
  • BFloat16 (Brain Floating Point Format)
  • torch.compile support
  • LLM MLP horizontal fusion implementation
  • Static shape inference
  • MHA NPU inference
  • NPU/GPU hetero compute
  • Paper
via Intel on GitHub
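Of the roadmap items above, 8-bit quantization is already exposed through the `dtype` argument of the `compile` call, per the project's README. The snippet below is a brief sketch of that option; the toy model is again illustrative rather than anything from Intel's examples.

```python
import torch
from torch import nn

import intel_npu_acceleration_library

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 8)).eval()

# dtype selects the on-NPU precision: float16 is the default, while
# torch.int8 takes the 8-bit quantization path from the roadmap above.
npu_model = intel_npu_acceleration_library.compile(model, dtype=torch.int8)

with torch.no_grad():
    print(npu_model(torch.randn(1, 256)).shape)  # torch.Size([1, 8])
```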

It is great to see the NPU Acceleration Library open-sourced, as it should ultimately lead to better implementations of AI applications running on Intel's dedicated AI engines. It will be interesting to see what sort of developments emerge on these engines moving ahead since, as Tony himself states, there is a lot packed in for both consumers and developers.

News Source: Tony Mongkolsmai
