device=GPU compiler hangs

Program compiles and runs with device=CPU, but the OpenCL compiler hangs with device=GPU

A nontrivial 7000-line OpenCL program compiles and executes correctly using the CPU as the target.

 cl::Context context(CL_DEVICE_TYPE_CPU, ...
(everything OK, program builds and executes correctly)

The OpenCL compiler hangs when we try compiling the same code specifying the GPU as the target device:

cl::Context context(CL_DEVICE_TYPE_GPU, ...

The primary environment is

   Intel i7 980 CPU (Dell XPS 9100)
   Sapphire HD 6970
   either Ubuntu 10.04 LTS AMD64 or WIN7 VS 2008
   Catalyst 11_8
   AMD APP 2.5

The choice of OS makes no difference: device=CPU works, but with device=GPU the OpenCL compiler hangs.

We have tried Catalyst 11_5, 11_6, 11_7, 11_8 and APP 2.4 as well as 2.5 with identical results in both the Ubuntu and WIN7 environments.

We have also run the code through GNU gcc in C99 mode (-std=c99) to search for questionable or bad syntax.


3 Replies

This is a known issue. Large OpenCL programs on the GPU can cause an exponential increase in compilation time. The only known workaround is to use smaller kernels.

What is the recommended workaround?  How do we minimize compile time?

We need to understand the guidelines we can follow if we have lengthy pieces of code that need to be executed.

Are there compile-time trade-offs between many short subroutines versus fewer, longer subroutines?

Should we favor local variables stored in a structure and pass the structure to subroutines as a single argument, or pass the variables directly as multiple subroutine parameters?  Which compiles faster?


On the GPU, function calls are not supported, so everything gets inlined, and that is what causes the problem. The issue isn't how things are coded, but the fact that after everything gets inlined, the program itself can be extremely large. While our compiler pushes the inlining as far back as possible, there are still cases that will cause an exponential increase in compile time, which is what you are seeing. Usually the increase is caused by the compiler using all of the memory and swapping to the hard drive.
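
To get a feel for why full inlining can blow up the size of a program, here is a rough C++ sketch of the arithmetic. It is only a toy model, not how the AMD compiler works internally. The names `Func`, `inlined_size` and `chain_size`, and the figure of 10 instructions per helper, are made up for illustration. The idea is that once calls are inlined, a function's size is its own body plus the inlined size of every callee, counted once per call site.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of one function: the size of its own body plus the
// indices of the functions it calls (one entry per call site).
struct Func {
    std::uint64_t own_size;
    std::vector<int> calls;
};

// Size of function `f` once every call beneath it has been inlined.
// Assumes an acyclic call graph (OpenCL C does not allow recursion)
// and own_size > 0, so 0 can mark "not computed yet" in the memo table.
std::uint64_t inlined_size(const std::vector<Func>& funcs, int f,
                           std::vector<std::uint64_t>& memo) {
    if (memo[f] != 0) return memo[f];
    std::uint64_t total = funcs[f].own_size;
    for (int callee : funcs[f].calls) {
        total += inlined_size(funcs, callee, memo);  // one copy per call site
    }
    memo[f] = total;
    return total;
}

// Builds a chain of `depth` helpers, each 10 "instructions" long and
// calling the next helper twice, then returns the inlined size of the
// top-level function.
std::uint64_t chain_size(int depth) {
    std::vector<Func> funcs(depth);
    for (int i = 0; i < depth; ++i) {
        funcs[i].own_size = 10;
        if (i + 1 < depth) funcs[i].calls = {i + 1, i + 1};
    }
    std::vector<std::uint64_t> memo(depth, 0);
    return inlined_size(funcs, 0, memo);
}
```

In this model, ten helpers amount to only 100 instructions of source, but `chain_size(10)` returns 10230, and `chain_size(20)` returns 10485750. The source grows linearly with the number of helpers while the inlined result doubles with each level. That is why the coding style matters less than the total size after inlining, and why smaller kernels are the suggested workaround.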