Intel Makes Its NPU Acceleration Library An Open-Source Asset, Allowing Devs To Optimize AI Applications
Intel has finally open-sourced its NPU Acceleration Library, allowing developers and enthusiasts to tune their applications to run best on Intel's AI engines.

The news comes from Intel Tech Evangelist Tony Mongkolsmai, who announced the newly open-sourced library on X.

With this step, the NPU Acceleration Library will help developers benefit from the NPUs built into CPU lineups such as the Meteor Lake "Core Ultra" series. The library is based on Python and simplifies development by providing a high-level interface, and it supports popular frameworks such as TensorFlow and PyTorch, giving developers the means to make AI-related tasks more efficient.
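To give a sense of that high-level interface, here is a minimal sketch based on the quickstart in the library's GitHub README. The `intel_npu_acceleration_library.compile` call is the documented entry point; the small model and tensor shapes below are purely illustrative.

```python
import torch
from torch import nn

import intel_npu_acceleration_library


class TinyClassifier(nn.Module):
    """A toy network standing in for a real workload."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    def forward(self, x):
        return self.net(x)


model = TinyClassifier().eval()

# A single call offloads the supported layers to the Intel NPU.
npu_model = intel_npu_acceleration_library.compile(model)

with torch.no_grad():
    logits = npu_model(torch.randn(1, 128))
print(logits.shape)  # torch.Size([1, 10])
```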

For devs that have been asking, check out the newly open sourced Intel NPU Acceleration library. I just tried it out on my MSI Prestige 16 AI Evo machine (windows this time, but the library supports Linux as well) and following the GitHub documentation was able to run TinyLlama… pic.twitter.com/UPMujuKGGT

— Tony Mongkolsmai (@tonymongkolsmai) March 1, 2024
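For reference, the TinyLlama run Tony mentions follows the example in the library's GitHub documentation. The condensed sketch below is adapted from that README example, so treat it as an approximation: the model ID is TinyLlama's public Hugging Face checkpoint, and the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

import intel_npu_acceleration_library

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

model = AutoModelForCausalLM.from_pretrained(model_id, use_cache=True).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token_id = tokenizer.eos_token_id
streamer = TextStreamer(tokenizer, skip_special_tokens=True)

# Offload the model's supported layers to the NPU, quantizing weights to int8.
model = intel_npu_acceleration_library.compile(model, dtype=torch.int8)

query = "What is an NPU?"
inputs = tokenizer(query, return_tensors="pt")
_ = model.generate(**inputs, max_new_tokens=64, streamer=streamer)
```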

Tony ran the NPU Acceleration Library on an MSI Prestige 16 AI Evo laptop, which features an Intel Core Ultra CPU. He was able to run the TinyLlama and Gemma-2b-it LLM models on the machine without performance disruptions, which indicates the potential packed into Intel's NPUs and how they promote an edge-AI environment for developers. Here is how the Intel development team itself describes the library:

The Intel NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.

In our quest to significantly improve the library's performance, we are directing our efforts toward implementing a range of key features, including:

  • 8-bit quantization
  • 4-bit Quantization and GPTQ
  • NPU-Native mixed precision inference
  • Float16 support
  • BFloat16 (Brain Floating Point Format)
  • torch.compile support
  • LLM MLP horizontal fusion implementation
  • Static shape inference
  • MHA NPU inference
  • NPU/GPU hetero compute
  • Paper
via Intel on GitHub
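Of the roadmap items above, 8-bit quantization is already exposed through the `dtype` argument of the `compile` call, per the project's README. The snippet below is a brief sketch of that option; the toy model is again illustrative rather than anything from Intel's examples.

```python
import torch
from torch import nn

import intel_npu_acceleration_library

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 8)).eval()

# dtype selects the on-NPU precision: float16 is the default, while
# torch.int8 takes the 8-bit quantization path from the roadmap above.
npu_model = intel_npu_acceleration_library.compile(model, dtype=torch.int8)

with torch.no_grad():
    print(npu_model(torch.randn(1, 256)).shape)  # torch.Size([1, 8])
```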

It is great to see the NPU Acceleration Library open-sourced, as it should ultimately lead to better implementations of AI applications running on Intel's dedicated AI engines. It will be interesting to see what sort of developments emerge on these engines moving ahead since, as Tony himself states, there is a lot packed in for both consumers and developers.

News Source: Tony Mongkolsmai
