I am trying to run Ollama in Docker so that it uses the GPU, and I can't get it to work. I am also able to run GPT4All with Vulkan drivers and text generation is fast there, but that's outside Docker, and I want to run Ollama within Docker for certain reasons.
I am running Pop!_OS.
I chose Pop!_OS over regular Ubuntu because I hoped the video drivers for my GPU would work better for gaming, programming, and science. I have an AMD GPU.
My system info shows an AMD® Ryzen 7 8840U w/ Radeon 780M Graphics × 16 and AMD® Radeon Graphics.
I could add an external GPU at some point, but that's expensive and a hassle, so I'd rather not if I can get this to work. The speed in GPT4All (a similar LLM app running outside of Docker) is acceptable when it uses the Vulkan driver. Ollama in Docker is clearly generating on the CPU based on the slow output; it's very slow, about 1/10th the speed of the Vulkan generation in GPT4All. I've never used an AMD GPU before and I am frustrated by the difficulty of the setup. NVIDIA was not as hard.
Running Ollama in Docker is useful for various programming and experimental applications. I have no idea what I am doing wrong, because there are so many guides and they all say to do different things. I installed the amdgpu .deb package from the AMD website (which adds AMD's repository), and I also installed the latest ROCm and amdgpu drivers. I am also running an LLM ROCm Docker image from AMD, although I am not sure whether that helps or is even needed, and I don't really understand what it does.
When I run docker with ollama/ollama:rocm it indicates it doesn’t recognize my graphics card:
:1153: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2024-10-11T11:30:20.215Z level=INFO source=images.go:753 msg="total blobs: 5"
time=2024-10-11T11:30:20.215Z level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-10-11T11:30:20.215Z level=INFO source=routes.go:1200 msg="Listening on [::]:11434 (version 0.3.12)"
time=2024-10-11T11:30:20.216Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 rocm_v60102]"
time=2024-10-11T11:30:20.216Z level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-10-11T11:30:20.218Z level=WARN source=amd_linux.go:60 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-10-11T11:30:20.220Z level=WARN source=amd_linux.go:341 msg="amdgpu is not supported" gpu=0 gpu_type=gfx1103 library=/usr/lib/ollama supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-10-11T11:30:20.220Z level=WARN source=amd_linux.go:343 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-10-11T11:30:20.220Z level=INFO source=amd_linux.go:361 msg="no compatible amdgpu devices detected"
time=2024-10-11T11:30:20.223Z level=INFO source=gpu.go:347 msg="no compatible GPUs were discovered"
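For reference, I'm starting the container roughly like this, following the standard instructions for the ROCm image (the exact flags may not match what I actually typed):

```
docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm
```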
There doesn't seem to be a universal answer to this problem; the guides all say different things and none of them have gotten this working for me.
Hi @tawhid, sorry about your experience so far. I'm Michael, one of the maintainers of the Ollama project.
This is an issue with ROCm having a small support matrix. A similar issue was filed on Ollama: https://github.com/ollama/ollama/issues/3189
There is a workaround: forcing a specific GFX version with `HSA_OVERRIDE_GFX_VERSION`, which others have found to be successful.
We don't enable unsupported AMD cards by default because each ROCm update could break them for those users. Sorry about this.
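As a rough sketch (not something we can guarantee for every chip), the override is set in the environment of the Ollama server process, and the value to try depends on your GPU generation, e.g.:

```
# Example only: pick the override closest to your GPU's gfx version.
# For the 780M (gfx1103), people have reported 11.0.2 or 11.0.0 working.
HSA_OVERRIDE_GFX_VERSION="11.0.2" ollama serve
```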
Thank you! I will try this out! I should have checked the GitHub issues! I wasn't sure if this was a Docker issue, an Ollama issue, a Pop!_OS issue, or an amdgpu issue, and I just didn't know what to do next!
Thank you for your help! Thanks for maintaining the project!
I am not sure how to apply the suggestions from the GitHub issue in Docker.
Is there a correct way to implement this in Docker?
I am using ollama/ollama:rocm. I am not sure whether running something like
`docker exec -it ollama/ollama:rocm HSA_OVERRIDE_GFX_VERSION="11.0.0" ollama serve &`
in a terminal would get any result.
The guide at https://github.com/alexhegit/Playing-with-ROCm/blob/main/inference/LLM/Run_Ollama_with_AMD_iGPU780M-... discusses running commands for Ollama directly, whereas I am using ollama/ollama:rocm inside Docker.
I don't know whether I should be trying to alter the image I pulled, or whether that won't work since it's an image. For example, would something like the sketch below be the right approach?
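This is just my guess; the container name and the 11.0.0 value are assumptions on my part:

```
# My guess: recreate the container with the override passed via -e
docker rm -f ollama
docker run -d --device /dev/kfd --device /dev/dri \
  -e HSA_OVERRIDE_GFX_VERSION="11.0.0" \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm
```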
This was extremely frustrating, but Ollama appears to be incompatible with Adrenalin 24.10. I was only able to get it to work on Windows and WSL Ubuntu with Adrenalin 24.6 or 24.7.