News Intel's NPU Acceleration Library goes open source — Meteor Lake CPUs can now run TinyLlama and other lightweight LLMs

Admin · Mar 2, 2024

Intel recently released its open-source NPU Acceleration Library, which enables small LLMs on Meteor Lake CPUs.


bit_user · Mar 2, 2024

This seems like just wrappers needed to use the NPU from PyTorch. While that's great, the project doesn't contain the parts that really interest me, such as the code for the "SHAVE" DSP cores.

Source: https://www.tomshardware.com/news/i...meteor-lake-architecture-launches-december-14

jlake3 · Mar 2, 2024

Of course, since the NPU Acceleration Library is made for developers and not ordinary users, it's not a simple task to use it for your purposes.

...so is there no standardized, OS-level framework for all of this new "AI PC"/"AI Laptop" marketing hype to actually tie into? No way for the hardware to tell the OS "Hey, I'm an NPU and here's the data types, standards, and the version levels of those I support"?

I remember hearing that XDNA suffered from a lack of compatible software, the article about the Snapdragon X Elite laptop from the other day mentioned some special setup and requiring the use of the Qualcomm AI stack, and it seems like Intel NPUs as well are something where an average user will need the developer to build support into the program for them.

I'm admittedly not on the generative AI hype train, but there are some uses where I (or people I help with PC things) could be interested in some AI-assisted photo-retouching or webcam background removal that runs on a special accelerator for better performance at lower power (if it's not adding too much to the cost of the PC), and it seems that if I buy something marked for "AI" and touting how many TFLOPS its NPU can do... there's no guarantee things will actually be able to utilize it now or continue to support it in the future.

bit_user · Mar 2, 2024

jlake3 said:
...so is there no standardized, OS-level framework for all of this new "AI PC"/"AI Laptop" marketing hype to actually tie into? No way for the hardware to tell the OS "Hey, I'm an NPU and here's the data types, standards, and the version levels of those I support"?

I'm not familiar with it, but I guess DirectML might be that, for doing inference on Windows. This looks like a good place to start:

Windows AI: Learn about Windows AI solutions, such as Windows Machine Learning, Windows Vision Skills, and Direct Machine Learning. (learn.microsoft.com)

jlake3 said:
it seems that if I buy something marked for "AI" and touting how many TFLOPS its NPU can do... there's no guarantee things will actually be able to utilize it now or continue to support it in the future.

Check the hardware requirements of the software you want to be AI-accelerated.

IMO, it's really a lot like the situation we have with GPUs, where you need to check not only the HW requirements but also benchmarks, to know how well the app actually runs on a given hardware spec. You cannot simply assume that TOPS translates directly into AI performance, any more than you could assume that the TFLOPS and GB/s of a graphics card predict game performance. Yes, there's a correlation, but also quite a lot of variation.
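One way to see why a headline TOPS figure is a poor predictor on its own is the roofline model: attainable throughput is the smaller of peak compute and memory bandwidth multiplied by the work done per byte moved. A minimal sketch follows. Every device name and number in it is a made-up placeholder, not a spec of any real NPU or GPU.

```python
def attainable_tops(peak_tops: float, bandwidth_gbs: float, ops_per_byte: float) -> float:
    """Roofline estimate: min(compute roof, memory roof), in tera-ops/s."""
    # bandwidth (GB/s) * intensity (ops/byte) gives giga-ops/s; /1000 converts to tera-ops/s
    memory_roof_tops = bandwidth_gbs * ops_per_byte / 1000.0
    return min(peak_tops, memory_roof_tops)


# Hypothetical devices: (peak TOPS, memory bandwidth in GB/s)
devices = {"chip_a": (40.0, 100.0), "chip_b": (20.0, 400.0)}

# Hypothetical workloads: operations performed per byte moved from memory
workloads = {"memory_bound": 2.0, "compute_bound": 500.0}

results = {
    (dev, wl): attainable_tops(peak, bw, intensity)
    for dev, (peak, bw) in devices.items()
    for wl, intensity in workloads.items()
}

for (dev, wl), tops in results.items():
    print(f"{dev} on {wl}: {tops} TOPS attainable")
```

With these placeholder numbers, chip_b has half the peak TOPS of chip_a but comes out ahead on the memory-bound workload, while chip_a wins on the compute-bound one. Real apps add more variation on top (drivers, supported data types, how well the model maps to the hardware), so benchmarks of the actual app remain the better guide.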

CmdrShepard · Mar 2, 2024

The situation isn't much different than it was back in the days of the first 3D accelerator cards.

CSMajor · Mar 2, 2024

jlake3 said:
...so is there no standardized, OS-level framework for all of this new "AI PC"/"AI Laptop" marketing hype to actually tie into? No way for the hardware to tell the OS "Hey, I'm an NPU and here's the data types, standards, and the version levels of those I support"?

I remember hearing that XDNA suffered from a lack of compatible software, the article about the Snapdragon X Elite laptop from the other day mentioned some special setup and requiring the use of the Qualcomm AI stack, and it seems like Intel NPUs as well are something where an average user will need the developer to build support into the program for them.

I'm admittedly not on the generative AI hype train, but there are some uses where I (or people I help with PC things) could be interested in some AI-assisted photo-retouching or webcam background removal that runs on a special accelerator for better performance at lower power (if it's not adding too much to the cost of the PC), and it seems that if I buy something marked for "AI" and touting how many TFLOPS its NPU can do... there's no guarantee things will actually be able to utilize it now or continue to support it in the future.

Of course. First you develop and test your model on the NPU, then you deploy it. To do that you could use something like OpenVINO or DirectML, both of which are standardized.
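As a rough illustration of what "build support into the program" can look like on the developer side, here is a minimal sketch of picking an accelerator with a fallback. The helper names `pick_device` and `detect_devices` are hypothetical. The sketch assumes OpenVINO's Python package exposes `Core().available_devices` and reports the NPU under the name "NPU"; if the package is not installed, it simply falls back to CPU.

```python
from typing import Iterable, List, Sequence


def pick_device(available: Iterable[str],
                preference: Sequence[str] = ("NPU", "GPU", "CPU")) -> str:
    """Return the first device whose family appears earliest in `preference`."""
    names = list(available)
    for wanted in preference:
        for name in names:
            # Devices may be reported with an index suffix such as "GPU.0",
            # so compare only the family part before the dot.
            if name.split(".")[0] == wanted:
                return name
    raise RuntimeError(f"none of {list(preference)} found in {names}")


def detect_devices() -> List[str]:
    """Ask OpenVINO which devices it sees; assume CPU only if it is not installed."""
    try:
        import openvino as ov  # optional dependency (assumption: 2023.1+ package layout)
        return list(ov.Core().available_devices)
    except ImportError:
        return ["CPU"]


# Deterministic example with a hand-written device list (no NPU present)
print(pick_device(["CPU", "GPU.0"]))
```

In an actual app you would call `pick_device(detect_devices())` and pass the result to whatever runtime compiles the model. The point is that the app, not the OS, decides what to do when no NPU is present.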
