I'm looking at:
How to run a Large Language Model (LLM) on your AM... - AMD Community
I installed LM Studio and everything went fine. I downloaded a Q4 model (I think it was Mistral 7B), asked it how to make chili, and it gave me a recipe in about a minute at roughly 10 tokens per second, with the CPU grinding at about 65%.
My question is: how do I know if this is using the AMD IPU? I also turned on GPU off-load, and on this machine that was slower, about 7 tokens per second, but I don't think this computer has the greatest GPU. So is 10 tokens per second on a 7840... 4 GHz, 32 GB of RAM normal / accelerated?
Any advice greatly appreciated.
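The throughput figures in the thread can be sanity-checked with simple arithmetic: 10 tokens/s sustained for a minute is about 600 tokens, which is roughly recipe-length. A minimal sketch (the helper name is mine, not part of LM Studio):

```python
def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput from a generated-token count and wall-clock time."""
    return n_tokens / elapsed_s

# A ~600-token reply in 60 seconds matches the ~10 tok/s seen here.
print(tokens_per_second(600, 60.0))  # 10.0
```

The same helper shows the GPU off-load run (about 7 tok/s) would need roughly 86 seconds for the same reply.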
Hello Henry,
I did the same; HWiNFO64 did not report any NPU usage. I did some research: according to hothardware.com, LM Studio is based on the llama.cpp project, and llama.cpp does not support AMD's XDNA NPUs yet.
Juergen
That's so weird. Why would AMD staff make the blog post then?
And the CPU usage is weird too: 65% when it is off-loading to the GPU, and 65% when it is not off-loading, yet with better tokens per second. On my other PC the CPU would peg out when there is no GPU off-load.
I'm leaning toward it works, but there's no way to prove it 😐
The blog post regarding LM Studio is related to Ryzen AI iGPU... NPU is not related.
Experience Meta Llama 3 with AMD Ryzen™ AI and Rad... - AMD Community
And I guess this one has nothing to do with NPUs either? Even though it says "AMD Ryzen™ Mobile 7040 Series and AMD Ryzen™ Mobile 8040 Series processors feature a Neural Processing Unit (NPU) which is explicitly designed to handle emerging AI workloads." right in the article.
I downloaded it and followed the instructions, but HWiNFO64 reports 0% NPU usage. 😞
I am disappointed as well. As an early adopter of a promising technology I am searching for cool applications, and this would have been one. The article you linked seems to state that the NPU is used, but in reality that is not the case.
As Ryzen AI PCs have options for both the iGPU and the NPU, some blog posts may only show iGPU or NPU capability. In this repo https://github.com/amd/RyzenAI-SW you can find several NPU-related examples, including Llama2. We plan to update the LLM scripts/examples in the coming weeks/months.
Thanks
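For anyone following the RyzenAI-SW route mentioned above: the NPU examples there go through ONNX Runtime rather than llama.cpp, where the NPU is exposed as the Vitis AI execution provider. A minimal sketch of preferring the NPU provider when it is available and falling back to CPU otherwise (the helper function is illustrative, not part of the repo):

```python
def pick_provider(available: list[str]) -> str:
    """Prefer the Ryzen AI NPU provider, fall back to plain CPU."""
    for provider in ("VitisAIExecutionProvider", "CPUExecutionProvider"):
        if provider in available:
            return provider
    raise RuntimeError("no usable execution provider found")

# With onnxruntime installed, `available` would come from
# onnxruntime.get_available_providers(); hard-coded here for illustration.
print(pick_provider(["VitisAIExecutionProvider", "CPUExecutionProvider"]))
print(pick_provider(["CPUExecutionProvider"]))
```

If only `CPUExecutionProvider` shows up, the NPU driver or the Vitis AI package is not installed correctly, which would also explain 0% NPU usage in HWiNFO64.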
Thank you for clarifying this matter. The description of LM Studio in that post was misleading. The development of amd/RyzenAI-SW seems to be at quite an early stage. I hope there will be a friendly C++ API for the GitHub community to facilitate further development.
I set up the development environment and finally got an example running. Here I can see the NPU usage in HWiNFO64, so I am pretty sure that the NPU was not used by LM Studio; otherwise the usage would have been reported, and the NPU, including its driver, does work when actually used. Maybe I need to choose another model in LM Studio, but I ...
The fact that pure CPU is faster than GPU may be due to the limited capability of the embedded GPU, as you say. Using AVX in the processor may be more efficient than sharing the load with the GPU.
Can you run any of the transformers examples?
RyzenAI-SW/example/transformers at main · amd/RyzenAI-SW · GitHub
Today, we're diving into the world of AI processing power with the Ryzen AI IPU. Powered by LM Studio, this dynamic duo unlocks a realm of possibilities. From generating lifelike text to crafting immersive experiences, the synergy between Ryzen AI IPU and LM Studio revolutionizes creativity. Join us as we explore the cutting-edge capabilities, unleashing the full potential of AI-driven innovation. Ryzen AI IPU and LM Studio — where imagination meets computation