mfried
Adept II

Does pragma unroll on CPU crash for you, too?

Just installed Catalyst 11.11, and all my OpenCL kernels with #pragma unroll when built on a CPU device abort in clBuildProgram -- an Access Violation in AMD's DLL somewhere. APP KernelAnalyzer also crashes generating x86 code. Kernel is below.

__kernel void MatrixMultiplySimple( __global float* C,
                                    __global const float* A,
                                    __global const float* B,
                                    const int width )
{
    const int x = get_global_id( 0 );
    const int y = get_global_id( 1 );
    float sum = 0.0f;

    #pragma unroll 4
    for( int k = 0; k < width; ++k )
        sum += A[ y * width + k ] * B[ k * width + x ];

    C[ y * width + x ] = sum;
}

2 Replies
notzed
Challenger

I recently moved an application from NVIDIA to AMD ... and had to remove all the unroll pragmas.

Between compiler crashes, invalid results, and extremely slow code, it was just easier to remove all of them than to figure out what was wrong. I guess the other thing is that they don't seem to be as useful for a few extra % of performance as they were on NVIDIA anyway, and the default compilation is usually OK enough. (Or perhaps the corollary: without a profiler to tell me how slow the code is to the microsecond, I care less about such details.) In hindsight, it was just a freebie to try which sometimes worked, but it's really just a workaround for a thick compiler.

I'm still using 11.9 FWIW.
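
For anyone who still wants the unrolling without the pragma, one option is to unroll the inner loop by hand. The following is only a rough sketch, written as plain C so it can be compiled and checked on a host; the function names dot_simple and dot_unrolled4 are made up for illustration and are not from the original kernel. The loop body is the one from MatrixMultiplySimple, and whether this actually helps (or avoids the crash) on a given Catalyst release is untested here.

```c
/* Reference version: the inner loop of MatrixMultiplySimple for one
 * output element (x, y). Hypothetical helper name. */
float dot_simple(const float *A, const float *B, int width, int x, int y)
{
    float sum = 0.0f;
    for (int k = 0; k < width; ++k)
        sum += A[y * width + k] * B[k * width + x];
    return sum;
}

/* Same computation, unrolled by hand four iterations at a time, with no
 * #pragma unroll. A single accumulator is kept so the additions happen
 * in the same order as in the simple loop. Hypothetical helper name. */
float dot_unrolled4(const float *A, const float *B, int width, int x, int y)
{
    float sum = 0.0f;
    int k = 0;

    /* Main body: four products per pass while at least four remain. */
    for (; k + 3 < width; k += 4) {
        sum += A[y * width + k]     * B[k * width + x];
        sum += A[y * width + k + 1] * B[(k + 1) * width + x];
        sum += A[y * width + k + 2] * B[(k + 2) * width + x];
        sum += A[y * width + k + 3] * B[(k + 3) * width + x];
    }

    /* Remainder: handles widths that are not a multiple of four. */
    for (; k < width; ++k)
        sum += A[y * width + k] * B[k * width + x];

    return sum;
}
```

In the kernel itself, the body of dot_unrolled4 would replace the pragma and the for loop, with x and y still coming from get_global_id. The remainder loop matters because width is a runtime argument and need not be a multiple of four.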

 


Hi mfried,

I tried generating CPU ISA for the kernel you posted. With an internal version of the driver, I am able to generate ISA with loop unrolling.

So I hope you find this fixed in the upcoming release.
