Platform: ROCm 6.1.3
OS: WSL2 Ubuntu 22.04
GPU: AMD Radeon RX 7900 XTX
Hi everyone,
I'm training a PyTorch deep learning model on an AMD GPU under WSL2. I installed ROCm and all the necessary packages successfully by following this guide: https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/wsl/install-pytorch.html
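In case it helps: GPU detection itself is fine (training does run at batch_size=1). A quick sanity check along these lines, using the standard torch.cuda calls that map to the HIP backend on ROCm builds, confirms PyTorch sees the card:

```python
import torch

# Standard sanity check; on ROCm builds the torch.cuda namespace maps to HIP.
print(torch.__version__)              # should show a ROCm build, e.g. "...+rocm6.1"
print(torch.cuda.is_available())      # True if the GPU is visible
print(torch.cuda.get_device_name(0))  # "AMD Radeon RX 7900 XTX"
print(torch.version.hip)              # HIP version the wheel was built against
```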
My project is an image classification model with input images of roughly 300x300 pixels. With batch_size=1, training runs smoothly. Here are the utilization stats from my monitoring panel (a simplified sketch of the data-loading setup follows the stats):
- CPU (i7, 12th gen): 50%
- RAM: 16GB out of 32GB
- GPU: 20%
- GPU Memory: 4GB out of 24GB
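To give a concrete picture of where batch_size and pin_memory come into play, here is a simplified sketch of the data-loading side (the dataset path and worker count are placeholders, not my exact code):

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Simplified sketch of the data pipeline; "data/train" and num_workers
# are placeholders, not the actual values from my project.
transform = transforms.Compose([
    transforms.Resize((300, 300)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("data/train", transform=transform)

train_loader = DataLoader(
    train_set,
    batch_size=1,     # works; raising this triggers the RuntimeError below
    shuffle=True,
    num_workers=4,    # placeholder value
    pin_memory=True,  # the "pin memory thread" in the error comes from this option
)

device = torch.device("cuda")  # ROCm builds expose the GPU via the cuda device type
```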
The issue arises when I try to increase the batch size beyond 1. I receive the following error:
RuntimeError: Caught RuntimeError in pin memory thread for device 0.
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
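Following the hints in the message, one way to narrow this down is to make HIP launches synchronous and take the pin-memory thread out of the picture, to see whether the failure moves or disappears. A minimal diagnostic sketch (dummy random data instead of my real dataset):

```python
import os

# Make HIP kernel launches report errors synchronously, as the traceback suggests.
# Must be set before any GPU work happens.
os.environ["HIP_LAUNCH_BLOCKING"] = "1"

import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in dataset (random 300x300 RGB images) just to exercise the loader.
dummy = TensorDataset(torch.randn(64, 3, 300, 300), torch.randint(0, 10, (64,)))

loader = DataLoader(
    dummy,
    batch_size=4,       # the batch size I actually want to use
    num_workers=0,      # single-process loading for easier debugging
    pin_memory=False,   # bypasses the "pin memory thread" named in the error
)

device = torch.device("cuda")
for images, labels in loader:
    images = images.to(device)  # move one batch to the GPU to surface any HIP error
    break
```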
I've already configured the WSL2 virtual machine with 128GB of memory, but the batch size is still limited to 1.
The configuration steps are documented here: https://learn.microsoft.com/en-us/windows/wsl/wsl-config
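For reference, the memory setting in %UserProfile%\.wslconfig follows the format from that page; mine is essentially:

```ini
# %UserProfile%\.wslconfig -- caps the RAM assigned to the WSL2 VM
[wsl2]
memory=128GB
```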
Does anyone have suggestions on how to increase GPU utilization and get larger batch sizes working? Any advice would be greatly appreciated!