2 Replies Latest reply on Dec 7, 2011 4:50 PM by himanshu.gautam

    Does pragma unroll on CPU crash for you, too?

    mfried

      Just installed Catalyst 11.11, and all my OpenCL kernels with #pragma unroll when built on a CPU device abort in clBuildProgram -- an Access Violation in AMD's DLL somewhere. APP KernelAnalyzer also crashes generating x86 code. Kernel is below.

      __kernel void MatrixMultiplySimple( __global float* C,
                                          __global const float* A,
                                          __global const float* B,
                                          const int width )
      {
          const int x = get_global_id( 0 );
          const int y = get_global_id( 1 );
          float sum = 0.0f;
      #pragma unroll 4
          for( int k = 0; k < width; ++k )
              sum += A[ y * width + k ] * B[ k * width + x ];
          C[ y * width + x ] = sum;
      }

        • Does pragma unroll on CPU crash for you, too?
          notzed

           

          Originally posted by: mfried Just installed Catalyst 11.11, and all my OpenCL kernels with #pragma unroll when built on a CPU device abort in clBuildProgram -- an Access Violation in AMD's DLL somewhere. APP KernelAnalyzer also crashes generating x86 code. Kernel is below.

           

          I recently moved an application from NVIDIA to AMD ... and had to remove all the unroll pragmas.

          From compiler crashes to invalid results to extremely slow code, it was just easier to remove all of them than to figure out what was wrong.  I guess the other thing is that they don't seem to be as useful for a few extra % of performance as they were on NVIDIA anyway, and the default compilation is usually OK enough.  (Or perhaps the corollary: without a profiler to tell me how slow the code is to the microsecond, I care less about such details.)  In hindsight, it was just a freebie to try which sometimes worked, but it's really just a workaround for a thick compiler.

          I'm still using 11.9 FWIW.
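          If dropping the pragma costs measurable performance, one possible fallback is to unroll the inner loop by hand. The following is only a minimal sketch in plain host-side C, not OpenCL C, and it has not been tried against Catalyst 11.11. The function names dot_simple and dot_unrolled4 are hypothetical and do not come from the thread. They compute one output element the same way the kernel's inner loop does, and the unrolled version adds a remainder loop so that a width that is not a multiple of 4 is still handled.

```c
/* Reference: the kernel's inner loop for output element (x, y), no unrolling. */
static float dot_simple(const float *A, const float *B, int width, int x, int y)
{
    float sum = 0.0f;
    for (int k = 0; k < width; ++k)
        sum += A[y * width + k] * B[k * width + x];
    return sum;
}

/* Hypothetical hand-unrolled variant: four products per iteration,
 * accumulated in the same order as the reference loop. */
static float dot_unrolled4(const float *A, const float *B, int width, int x, int y)
{
    float sum = 0.0f;
    int k = 0;

    /* Main loop: runs while at least four elements remain. */
    for (; k + 3 < width; k += 4) {
        sum += A[y * width + k]     * B[k * width + x];
        sum += A[y * width + k + 1] * B[(k + 1) * width + x];
        sum += A[y * width + k + 2] * B[(k + 2) * width + x];
        sum += A[y * width + k + 3] * B[(k + 3) * width + x];
    }

    /* Remainder loop: the last width % 4 elements. */
    for (; k < width; ++k)
        sum += A[y * width + k] * B[k * width + x];

    return sum;
}
```

          The same loop body should carry over to the kernel by replacing x and y with the get_global_id() values, but whether it is any faster than the compiler's default output is something only a measurement on the target device can tell.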