AI Discussions

amd18
Journeyman III

Custom Flags for Ollama in Docker Needed with Rocm 6.2?

I made a post about this before, but I lost access to that account after an accidental restart.

 

I am trying to run Ollama in Docker. There is an image, ollama/ollama:rocm, for AMD GPUs, but it won't work with the 780M iGPU of my AMD Ryzen CPU.

 

There are ways around this for the non-Docker version using terminal commands. Can they be applied to Docker somehow? (I also asked about this on the Docker community forum to see if there was a way to apply the commands inside the container, but they were a bit condescending, said it had nothing to do with Docker, and wouldn't help. I also lost access to my account there after the restart, and now they flag me as spam and won't let me create an account due to a "server error." I'm not spam.)

 

I am also not sure whether any of this changed with ROCm 6.2. Is there a way to apply this guide to the Docker version of Ollama, so that I can run Ollama in a container without it being so slow?

 

This is what I tried in the terminal:

 

HSA_OVERRIDE_GFX_VERSION="11.0.0" sudo docker run ollama/ollama:rocm
[sudo] password for user:
Couldn't find '/root/.ollama/id_ed25519'. Generating new private key.
Your new public key is:

ssh-(key)

 

2024/10/12 19:08:40 routes.go:1153: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2024-10-12T19:08:40.037Z level=INFO source=images.go:753 msg="total blobs: 0"
time=2024-10-12T19:08:40.037Z level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-10-12T19:08:40.037Z level=INFO source=routes.go:1200 msg="Listening on [::]:11434 (version 0.3.12)"
time=2024-10-12T19:08:40.038Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[rocm_v60102 cpu cpu_avx cpu_avx2]"
time=2024-10-12T19:08:40.038Z level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-10-12T19:08:40.039Z level=WARN source=amd_linux.go:60 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-10-12T19:08:40.039Z level=WARN source=amd_linux.go:202 msg="amdgpu too old gfx000" gpu=0
time=2024-10-12T19:08:40.039Z level=INFO source=amd_linux.go:361 msg="no compatible amdgpu devices detected"
time=2024-10-12T19:08:40.039Z level=ERROR source=amd_linux.go:364 msg="amdgpu devices detected but permission problems block access" error="kfd driver not loaded. If running in a container, remember to include '--device /dev/kfd --device /dev/dri'"
time=2024-10-12T19:08:40.039Z level=INFO source=gpu.go:347 msg="no compatible GPUs were discovered"
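
My guess (unconfirmed) is that prefixing sudo docker run with HSA_OVERRIDE_GFX_VERSION="11.0.0" only sets the variable in the host shell, not inside the container, and the last ERROR line suggests the container also never got /dev/kfd and /dev/dri. Is something like the following the right way to do it? (The -e and --device flags are my assumption of the Docker equivalent of the terminal workaround; untested on my side.)

# Sketch: pass the override into the container with -e and
# expose the ROCm kernel interfaces the log says are missing
sudo docker run \
  -e HSA_OVERRIDE_GFX_VERSION=11.0.0 \
  --device /dev/kfd \
  --device /dev/dri \
  ollama/ollama:rocm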

 

The GitHub guide says:

 

Ollama can run on the 780M iGPU of an AMD Ryzen CPU on Linux using ROCm. It only needs a few extra settings compared with a Radeon dGPU such as the RX 7000 series.

Keys for usage

  • Ryzen 7000/8000 series CPU with the 780M iGPU
  • amdgpu driver and ROCm 6.0
  • Linux is required (Windows and WSL2 are not supported)
  • BIOS must be set to enable the iGPU and dedicate more than 1 GB of RAM to VRAM
  • HSA_OVERRIDE_GFX_VERSION="11.0.0" is set (an extra setting for the AMD 780M iGPU)
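
(For context: as I understand it, the 780M reports as gfx1103, which many ROCm builds ship no kernels for, and the override makes it present itself as gfx1100, i.e. version 11.0.0. To check what your iGPU actually reports, assuming rocminfo from the ROCm tools is installed:)

# List the gfx targets ROCm sees on the host
rocminfo | grep -i gfx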

Prerequisites

  1. Set the UMA size for the iGPU in the BIOS (at least 1 GB; more than 8 GB is recommended, since the Llama3:8b q4_0 model alone is 4.7 GB).

  2. Install the GPU driver and ROCm; refer to the ROCm installation guide.

  3. Install Ollama

    curl -fsSL https://ollama.com/install.sh | sh

Steps

The iGPU is not detected by Ollama by default, so extra steps are needed to enable it.

  1. Stop the ollama.service

    sudo systemctl stop ollama.service

    Then find the PID of any remaining ollama process with 'ps -elf | grep ollama' and terminate it with 'kill [pid]'.

  2. For the 780M iGPU with ROCm (this does not work in WSL; it must run on Linux), start the server with the override:

    HSA_OVERRIDE_GFX_VERSION="11.0.0" ollama serve &

  3. Run ollama

    ollama run tinyllama

    Use rocm-smi to watch the iGPU utilization while Ollama runs with ROCm (see the sketch after this list).
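
For continuous monitoring, a minimal sketch (assuming the watch utility is available):

# Refresh the rocm-smi readout every second while a model is loaded
watch -n 1 rocm-smi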

Instead of step 2 above, you can configure ollama.service to use the iGPU with ROCm by default.

sudo systemctl edit ollama.service

Add the following to /etc/systemd/system/ollama.service.d/override.conf:

[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
 

Then reboot Linux, or just restart ollama.service with:

sudo systemctl restart ollama.service
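
To mirror this "make it the default" approach in Docker, my best guess at the persistent equivalent is below; the volume name, port mapping, and restart policy are my assumptions based on Ollama's documented defaults (models under /root/.ollama, API on port 11434), not something from the guide:

# Sketch: detached container with the override and devices baked in,
# restarted automatically, keeping models in a named volume
sudo docker run -d \
  --name ollama \
  --restart unless-stopped \
  -e HSA_OVERRIDE_GFX_VERSION=11.0.0 \
  --device /dev/kfd \
  --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama:rocm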
