To make sure my kernel runs on all kinds of devices, I added an old Radeon HD 7700 to my system, next to my R9 Nano.
Both are GCN cards, so I thought it would run without too many problems.
But it seems to be more difficult than that:
When initializing all devices, the program segfaulted while creating a single context with multiple GPUs.
So I thought: not a problem, I have 128 GB of RAM on this machine, I don't care if it takes a bit more memory, I'll just create a separate context for each GPU (tell me if this idea is stupid).
After that, the segmentation fault was gone, but this was the output of my program:
--------------------------------------------------------
CL_SUCCESS
Context creation for device 0 : CL_SUCCESS
Program creation for device 0 : CL_SUCCESS
Compile Program for device 0 : CL_SUCCESS
attention! cl device Fiji is working in the mines from now ... RIP CL_SUCCESS
Context creation for device 1 : CL_SUCCESS
Program creation for device 1 : CL_SUCCESS
Compile Program for device 1 : CL_BUILD_PROGRAM_FAILURE
Internal error: Link failed.
Make sure the system setup is correct.
CL_SUCCESS
Context creation for device 1 : CL_SUCCESS
Program creation for device 1 : CL_SUCCESS
Compile Program for device 1 : CL_SUCCESS
attention! cl device pthread-AMD Opteron(TM) Processor 6276 is working in the mines from now ... RIP CL_SUCCESS
WriteBuffer success
WriteBuffer success
Kernel success
Kernel success
^C
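For anyone reproducing a log like the one above: the status names are the numeric `cl_int` return codes of the OpenCL calls, printed by name. Below is a minimal sketch of such a helper, not the code from the original program. The `ST_` constants, `status_name` and `format_step` are hypothetical names chosen for illustration; the numeric values are the ones defined in the Khronos `cl.h` header, redefined locally here (an assumption made so the sketch compiles without the OpenCL SDK installed).

```c
#include <stdio.h>

/* Numeric values as defined in the Khronos cl.h header. They are
 * redefined here under a hypothetical ST_ prefix so that the sketch
 * builds without the OpenCL headers; real code would use the CL_* macros. */
enum {
    ST_SUCCESS                = 0,
    ST_DEVICE_NOT_FOUND       = -1,
    ST_DEVICE_NOT_AVAILABLE   = -2,
    ST_COMPILER_NOT_AVAILABLE = -3,
    ST_OUT_OF_RESOURCES       = -5,
    ST_OUT_OF_HOST_MEMORY     = -6,
    ST_BUILD_PROGRAM_FAILURE  = -11,
    ST_INVALID_DEVICE         = -33,
    ST_INVALID_CONTEXT        = -34
};

/* Map a status code to its OpenCL name (only a subset is covered). */
const char *status_name(int status)
{
    switch (status) {
    case ST_SUCCESS:                return "CL_SUCCESS";
    case ST_DEVICE_NOT_FOUND:       return "CL_DEVICE_NOT_FOUND";
    case ST_DEVICE_NOT_AVAILABLE:   return "CL_DEVICE_NOT_AVAILABLE";
    case ST_COMPILER_NOT_AVAILABLE: return "CL_COMPILER_NOT_AVAILABLE";
    case ST_OUT_OF_RESOURCES:       return "CL_OUT_OF_RESOURCES";
    case ST_OUT_OF_HOST_MEMORY:     return "CL_OUT_OF_HOST_MEMORY";
    case ST_BUILD_PROGRAM_FAILURE:  return "CL_BUILD_PROGRAM_FAILURE";
    case ST_INVALID_DEVICE:         return "CL_INVALID_DEVICE";
    case ST_INVALID_CONTEXT:        return "CL_INVALID_CONTEXT";
    default:                        return "UNKNOWN_CL_STATUS";
    }
}

/* Write one log line of the form "<step> for device <n> : <STATUS>"
 * into buf. Returns the value returned by snprintf. */
int format_step(char *buf, size_t size, const char *step,
                unsigned device, int status)
{
    return snprintf(buf, size, "%s for device %u : %s",
                    step, device, status_name(status));
}
```

With one context per device, calling something like `format_step` after each step (context creation, program creation, build) makes it easy to see which device the failing status belongs to, as in the `CL_BUILD_PROGRAM_FAILURE` line for device 1.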
I am guessing I am missing some GCN 1.0 libraries. Do the Linux 17.30 drivers support the Radeon HD 7700 cards?
Thanks
*edit:
Sorry, I just saw I forgot the program source, here it is
I have already asked a lot of questions here, so tell me if it gets annoying.
*edit2:
Confirmed to be a problem with the 4.13-rc4 kernel, as opposed to the previously tested versions (4.9.20, 4.10.17), but I don't think this is a kernel-related problem.
OK, for people reading this, I have proof that it is a compiler problem:
While my code is still dirty, it actually compiles on Mesa 17.2-rc3!
On amdgpu-pro it still fails with "Link failed".
Hopefully the AMD engineers will fix this soon, because under Mesa my program causes GPU faults and doesn't run correctly on any GPU (while on amdgpu-pro it does run correctly on the R9 Nano).
*edit:
As always, cleaning up the code removed the link failure!
While I have no idea what exactly went wrong (and the output is still wrong), it now compiles.
So no blame for AMD; I should be more careful when reporting bugs.