amitporat
Journeyman III

unexplained Segmentation fault

Hi,

I'm new to OpenCL and am learning it by writing simple applications.

I changed the MatrixTranspose sample code into MatrixRotate and stumbled upon a strange error that looks like a compiler bug. I'm using SDK 2.3 and an HD 5450.

If I use "globalIdx" instead of "(groupIdx*blockSize + localIdx)", the kernel runs correctly, but the two expressions should contain the same value (I have checked), so it is very confusing.

Could someone please explain?

 

__kernel void matrixRotate(__global float * output,
                           __global float * input,
                           __local  float * block,
                           const uint width,
                           const uint height,
                           const uint blockSize)
{
    uint globalIdx = get_global_id(0);
    uint globalIdy = get_global_id(1);

    uint localIdx = get_local_id(0);
    uint localIdy = get_local_id(1);

    /* copy from input to local memory */
    block[localIdy*blockSize + localIdx] = input[globalIdy*width + globalIdx];

    /* wait until the whole block is filled */
    barrier(CLK_LOCAL_MEM_FENCE);

    uint groupIdx = get_group_id(0);
    uint groupIdy = get_group_id(1);

    /* calculate the corresponding target location for transpose by inverting x and y values */
    uint m = (height - 1);
    uint targetGlobalIdx = m - (groupIdy*blockSize + localIdy);
    uint targetGlobalIdy = (groupIdx*blockSize + localIdx); //!!!!! using globalIdx solve the problem but WHY !!!!

    /* calculate the corresponding raster indices of source and target */
    uint targetIndex = targetGlobalIdy*width + targetGlobalIdx;
    uint sourceIndex = localIdy * blockSize + localIdx;

    output[targetIndex] = block[sourceIndex];
}

0 Likes
5 Replies
pulec
Journeyman III

You mean:
uint targetGlobalIdy = globalIdx;
?

I suppose you are right; according to the code, they should be the same. But from the fact that the program crashes with one expression and not with the other, I don't think you should conclude that the values are different...
0 Likes

I ran it on a square problem where width = height, and the problem occurs.
The program runs OK with

uint targetGlobalIdy = globalIdx;

and I've checked that (groupIdx*blockSize + localIdx) is equal to "globalIdx".

Are these kinds of inconsistencies common on this platform?

 

0 Likes

Originally posted by: amitporat

I ran it on a square problem where width = height, and the problem occurs.
The program runs OK with

uint targetGlobalIdy = globalIdx;

and I've checked that (groupIdx*blockSize + localIdx) is equal to "globalIdx".

Are these kinds of inconsistencies common on this platform?

Yeah, sorry, I noticed that you meant a square matrix, so I deleted the EDIT note.
I don't know, but if they are really the same (as they should be), then the error has to lie somewhere else. I can't imagine that the card would simply compute a wrong value.
Anyway, I'm afraid I don't see any error either for now. Maybe someone else will.

EDIT: Another idea is that computing the index could make the kernel request more registers than the device has. But that would prevent the kernel from running, and the framework should catch the error.
Additional question: have you also modified the CPU (host) code, or only the kernel?
I also suggest trying to find out what exactly causes the segfault. It seems to be the kernel call, but it could be somewhere else.
0 Likes

No, the host code is the same.

The kernel code is also very similar to "MatrixTranspose".

Strange indeed...

0 Likes

Thanks for reporting the issue. I will report it to the developer team once I confirm it on my end.

0 Likes