6 Replies Latest reply on Jun 18, 2010 6:15 PM by koarl0815

# Simple OpenCL kernel freezes system with Stream SDK 2.1

Hello,

The attached OpenCL kernel is supposed to compute a plane rotation of two vectors. For some reason, it leads to a system freeze under Stream SDK 2.1, while it works correctly under another OpenCL implementation. Does anyone have an idea why this could be the case? Could it be related to a bug in the Stream SDK?

WARNING: The code below may freeze your system.

Best regards,
Karli

    ////// plane rotation: (x,y) <- (\alpha x + \beta y, -\beta x + \alpha y)
    __kernel void plane_rotation(
        __global float * vec1,
        __global float * vec2,
        float alpha,
        float beta,
        unsigned int size)
    {
      float tmp1 = 0;
      float tmp2 = 0;
      for (unsigned int i = get_global_id(0); i < size; i += get_global_size(0))
      {
        tmp1 = vec1[i];
        tmp2 = vec2[i];
        //the following barrier is not needed, but does not improve the situation.
        barrier(CLK_GLOBAL_MEM_FENCE);
        vec1[i] = alpha * tmp1 + beta * tmp2;
        vec2[i] = alpha * tmp2 - beta * tmp1;
      }
    };

• ###### Simple OpenCL kernel freezes system with Stream SDK 2.1

Which OS/GPU/Driver are you using?

Also, could you post the runtime code? A compilable test case would make it easy to reproduce and track down the problem.

• ###### Simple OpenCL kernel freezes system with Stream SDK 2.1

Well, Murphy's Law just hit me again. For the simple test code, it is now working again - the barrier caused some problems.

However, I remember that I added the barrier *because* the kernel wasn't working as expected and system freezes occurred. Anyway, I will come back to that as soon as our test system is finally up and running...

• ###### Simple OpenCL kernel freezes system with Stream SDK 2.1

Are you using a 7xx GPU with group-size larger than 64 for this kernel?

I have seen the driver reset with that combination before. It's always better to query the group size from the runtime using the function clGetKernelWorkGroupInfo.
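As a rough sketch of that advice: once the per-kernel limit has been queried, the launch sizes can be derived from it instead of being hard-coded. The helpers `pick_local_size` and `round_up_global` below are hypothetical names, not part of OpenCL; `kernel_max` is assumed to hold the value returned by `clGetKernelWorkGroupInfo` for `CL_KERNEL_WORK_GROUP_SIZE`, and the rounding assumes a runtime that requires the global size to be a multiple of the local size.

```c
#include <stddef.h>

/* Hypothetical helper: clamp a preferred work-group size to the limit the
   runtime reports for this kernel on this device. kernel_max is assumed to
   come from a query such as
     clGetKernelWorkGroupInfo(kernel, device, CL_KERNEL_WORK_GROUP_SIZE,
                              sizeof(size_t), &kernel_max, NULL);          */
size_t pick_local_size(size_t kernel_max, size_t preferred)
{
    if (kernel_max == 0)
        return 1;                 /* defensive fallback for a failed query */
    if (preferred == 0 || preferred > kernel_max)
        return kernel_max;        /* never exceed the reported limit */
    return preferred;
}

/* Hypothetical helper: round the global size up to a multiple of the local
   size. The extra work-items are harmless for a kernel that guards its
   accesses with i < size, as plane_rotation does. Assumes local > 0.      */
size_t round_up_global(size_t n, size_t local)
{
    return ((n + local - 1) / local) * local;
}
```

With a reported limit of 64 and a preferred size of 256, this would launch groups of 64 rather than relying on the larger value.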

• ###### Simple OpenCL kernel freezes system with Stream SDK 2.1

Thanks for the input! I first observed the freezes on a new 64-bit Linux kernel (I think it was 2.6.33). A few days later the test system became unavailable, so today I tried it on 64-bit Windows 7, where it worked. The GPU was a Radeon HD 5850, so the group size was not an issue. A Linux test system should be back soon...

• ###### Simple OpenCL kernel freezes system with Stream SDK 2.1

    for (unsigned int i = get_global_id(0); i < size; i += get_global_size(0))
      barrier()

This is illegal in OpenCL. If a barrier is inside control flow, then every thread in the work-group must hit the barrier on each iteration of the loop. Because the loop count can differ from thread to thread, the results are undefined, and in the case of running on our 5XXX GPUs, your kernel deadlocks the hardware.
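To make the differing loop counts concrete, here is a small host-side sketch in plain C (not kernel code). The function names are hypothetical. `stride_loop_trips` counts how often one work-item would reach the barrier in the loop quoted above, and `uniform_trips` gives an iteration count that is the same for every work-item, which is what a barrier inside the loop would need.

```c
#include <stddef.h>

/* Iterations (and therefore barrier hits) performed by the work-item with
   global id gid in
     for (i = gid; i < size; i += gsize) { ... barrier(...); ... }
   Assumes gsize > 0.                                                     */
size_t stride_loop_trips(size_t gid, size_t size, size_t gsize)
{
    if (gid >= size)
        return 0;                 /* this work-item never enters the loop */
    return (size - gid + gsize - 1) / gsize;
}

/* One trip count shared by all work-items: ceil(size / gsize). A loop that
   runs this many times, guards only the memory accesses with i < size, and
   keeps the barrier outside that guard lets every work-item reach the
   barrier equally often.                                                 */
size_t uniform_trips(size_t size, size_t gsize)
{
    return (size + gsize - 1) / gsize;
}
```

For `size = 10` and a global size of 4, work-items 0 and 1 make three passes while work-items 2 and 3 make only two, so the barrier in the third pass is never reached by all of them.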
• ###### Simple OpenCL kernel freezes system with Stream SDK 2.1

Yeah, a barrier within a for-loop is certainly a bad thing if it can't be reached by all threads...
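
For reference, a host-side emulation in plain C of the same loop with the barrier simply removed. This is only a sketch to check the arithmetic, not the kernel itself: `plane_rotation_item` and `plane_rotation_emulated` are hypothetical names, and the work-items run one after another rather than in parallel. Each work-item reads and writes only its own indices `i`, so no synchronization between work-items is required for a correct result.

```c
#include <stddef.h>

/* Emulates one work-item of plane_rotation without the barrier.
   gid plays the role of get_global_id(0), gsize of get_global_size(0). */
void plane_rotation_item(float *vec1, float *vec2, float alpha, float beta,
                         size_t size, size_t gid, size_t gsize)
{
    for (size_t i = gid; i < size; i += gsize) {
        float tmp1 = vec1[i];     /* read both inputs before overwriting */
        float tmp2 = vec2[i];
        vec1[i] = alpha * tmp1 + beta * tmp2;
        vec2[i] = alpha * tmp2 - beta * tmp1;
    }
}

/* Runs all emulated work-items sequentially. Assumes gsize > 0. */
void plane_rotation_emulated(float *vec1, float *vec2, float alpha,
                             float beta, size_t size, size_t gsize)
{
    for (size_t gid = 0; gid < gsize; ++gid)
        plane_rotation_item(vec1, vec2, alpha, beta, size, gid, gsize);
}
```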