Hi,
I'm new to OpenCL, and I'm learning it by trying to write simple apps.
I changed the MatrixTranspose code into MatrixRotate and stumbled upon a strange error that looks like a compiler error. I'm using SDK 2.3 and an HD 5450.
If I use "globalIdx" instead of "(groupIdx*blockSize + localIdx)" I get the correct result, but the two should contain the same value (I checked), so this is very confusing.
Could someone explain, please?
__kernel void matrixRotate(__global float * output,
                           __global float * input,
                           __local  float * block,
                           const uint width,
                           const uint height,
                           const uint blockSize)
{
    uint globalIdx = get_global_id(0);
    uint globalIdy = get_global_id(1);

    uint localIdx = get_local_id(0);
    uint localIdy = get_local_id(1);

    /* copy from input to local memory */
    block[localIdy*blockSize + localIdx] = input[globalIdy*width + globalIdx];

    /* wait until the whole block is filled */
    barrier(CLK_LOCAL_MEM_FENCE);

    uint groupIdx = get_group_id(0);
    uint groupIdy = get_group_id(1);

    /* calculate the corresponding target location for transpose by inverting x and y values */
    uint m = (height - 1);
    uint targetGlobalIdx = m - (groupIdy*blockSize + localIdy);
    uint targetGlobalIdy = (groupIdx*blockSize + localIdx); //!!!!! using globalIdx solve the problem but WHY !!!!

    /* calculate the corresponding raster indices of source and target */
    uint targetIndex = targetGlobalIdy*width + targetGlobalIdx;
    uint sourceIndex = localIdy * blockSize + localIdx;

    output[targetIndex] = block[sourceIndex];
}
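To tell a wrong device result from a wrong expectation, it can help to compare the kernel output against a plain CPU version of the same mapping. The following is only a sketch in host-side C, assuming the square case (width = height = n) and row-major storage; the function names rotate_reference_square and count_differences are made up for illustration and are not part of the SDK sample.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference for the square case (width == height == n).
 * It mirrors the kernel's index mapping: the input element at column gx,
 * row gy is written to row gx, column (n - 1 - gy) of the output. */
void rotate_reference_square(float *out, const float *in, size_t n)
{
    assert(out != in); /* the rotation is not done in place */
    for (size_t gy = 0; gy < n; ++gy) {
        for (size_t gx = 0; gx < n; ++gx) {
            size_t target_x = (n - 1) - gy; /* targetGlobalIdx in the kernel */
            size_t target_y = gx;           /* targetGlobalIdy in the kernel */
            out[target_y * n + target_x] = in[gy * n + gx];
        }
    }
}

/* Hypothetical helper: counts positions where two buffers differ,
 * e.g. the buffer read back from the device and the CPU reference. */
size_t count_differences(const float *a, const float *b, size_t count)
{
    size_t differences = 0;
    for (size_t i = 0; i < count; ++i) {
        if (a[i] != b[i]) {
            ++differences;
        }
    }
    return differences;
}
```

Running the reference on the same input and passing both results to count_differences shows how many elements are wrong for each variant of targetGlobalIdy.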
I run it on a square problem, where width = height, and the problem occurs.
The program runs OK with
uint targetGlobalIdy = globalIdx;
and I've checked that (groupIdx*blockSize + localIdx) is equal to "globalIdx".
Are these kinds of inconsistencies common on this platform?
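One thing that may be worth ruling out: in OpenCL, get_global_id(0) equals get_group_id(0)*get_local_size(0) + get_local_id(0) when no global offset is used, so (groupIdx*blockSize + localIdx) only matches globalIdx if the blockSize argument is the same as the local work size given to clEnqueueNDRangeKernel. The small host-side C sketch below just emulates the 1D work-item IDs to show that relationship. It is an illustration under that assumption, not a diagnosis of this kernel, and the function name count_id_mismatches is hypothetical.

```c
#include <stddef.h>

/* Emulates a 1D NDRange with zero global offset and counts the work-items
 * for which group_id * block_size + local_id differs from the global id.
 * Assumes global_size is a multiple of local_size. */
size_t count_id_mismatches(size_t global_size, size_t local_size, size_t block_size)
{
    size_t mismatches = 0;
    size_t num_groups = global_size / local_size;

    for (size_t group_id = 0; group_id < num_groups; ++group_id) {
        for (size_t local_id = 0; local_id < local_size; ++local_id) {
            /* value get_global_id(0) would have for this work-item */
            size_t global_id = group_id * local_size + local_id;
            /* value rebuilt from the kernel argument blockSize */
            size_t rebuilt = group_id * block_size + local_id;
            if (rebuilt != global_id) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

With block_size equal to local_size the count is zero; with a different block_size every work-item outside group 0 mismatches.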
Originally posted by: amitporat
I run it on a square problem, where width = height, and the problem occurs.
The program runs OK with
uint targetGlobalIdy = globalIdx;
and I've checked that (groupIdx*blockSize + localIdx) is equal to "globalIdx".
Are these kinds of inconsistencies common on this platform?
No, the host code is the same.
The kernel code is also very similar to "MatrixTranspose".
Strange indeed...
Thanks for reporting the issue. I will report it to the developer team once I confirm it on my end.