mux85
Journeyman III

fixed size array not working when used inside loop

hi, i have a kernel that computes a background from a video. in this kernel i use an array to store the pixels at the same position in different frames. on cpu the kernel works well, but on gpu (hd 5850) it doesn't. while simplifying the kernel to find the source of the problem, i found that when i write to this array inside a for loop the data aren't written correctly; outside the loop the data are correct (but quite useless). here is the code. thanks in advance for any help

#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable

#define MAX_FRAMES 64

typedef struct
{
    uchar r;
    uchar g;
    uchar b;
} rgb;

void swap(rgb * a, rgb * b)
{
    rgb t = *a;
    *a = *b;
    *b = t;
}

float rgbToLum(rgb pix)
{
    return 0.3f*pix.r + 0.59f*pix.g + 0.11f*pix.b;
}

// bubble sort by luminance
void sort(rgb * v, uint n)
{
    bool swapped = true;
    while(swapped)
    {
        swapped = false;
        for(uint i=0; i<n-1; i++)
            if(rgbToLum(v[i]) > rgbToLum(v[i+1]))
            {
                swap(&v[i], &v[i+1]);
                swapped = true;
            }
        n = n-1;
    }
}

rgb median(rgb * v, uint n)
{
    sort(v, n);
    return v[n/2];
}

rgb average(rgb m[2][2])
{
    rgb a;
    ushort r=0, g=0, b=0;
    for(uint i=0; i<2; i++)
        for(uint j=0; j<2; j++)
        {
            r += m[i][j].r;
            g += m[i][j].g;
            b += m[i][j].b;
        }
    a.r = r/4;
    a.g = g/4;
    a.b = b/4;
    return a;
}

kernel void BackgroundKernel(
    global read_only rgb * buf_in,
    global write_only rgb * buf_bg,
    read_only uint n)
{
    ushort i = get_global_id(0);
    ushort j = get_global_id(1);
    ushort h_bg = get_global_size(0);
    ushort w_bg = get_global_size(1);
    ushort h_in = h_bg*2;
    ushort w_in = w_bg*2;
    uint frame_size = w_in*h_in;
    uint pos_in = w_in*i*2 + j*2;
    uint pos_bg = w_bg*i + j;

    //down-scaling of the frames
    rgb matr[2][2];
    rgb temp[MAX_FRAMES];
    for(uint k=0; k<n; k++)
    {
        matr[0][0] = buf_in[pos_in];
        matr[0][1] = buf_in[pos_in+1];
        matr[1][0] = buf_in[pos_in+w_in];
        matr[1][1] = buf_in[pos_in+w_in+1];
        temp[k] = average(matr);
        pos_in += frame_size;
    }

    //median of the frames used as bg
    buf_bg[pos_bg] = median(temp, n);
}
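Since the kernel gives correct results on the cpu, a plain C reference for a single output pixel may help to pin down where the gpu output diverges. The following is only a minimal host-side sketch, not code from the original project: the function name reference_bg_pixel, its parameter list and the helper names are hypothetical, and it assumes the same layout as the kernel (frames stored back to back, each 2*w_bg by 2*h_bg pixels, row-major).

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAMES 64

typedef struct { uint8_t r, g, b; } rgb;

/* Same luminance weights as rgbToLum in the posted kernel. */
float rgb_to_lum(rgb p)
{
    return 0.3f * p.r + 0.59f * p.g + 0.11f * p.b;
}

/* Average of the 2x2 block whose top-left pixel is frame[pos]. */
rgb average_2x2(const rgb *frame, size_t pos, size_t w_in)
{
    const size_t idx[4] = { pos, pos + 1, pos + w_in, pos + w_in + 1 };
    unsigned r = 0, g = 0, b = 0;
    for (int k = 0; k < 4; k++) {
        r += frame[idx[k]].r;
        g += frame[idx[k]].g;
        b += frame[idx[k]].b;
    }
    rgb a = { (uint8_t)(r / 4), (uint8_t)(g / 4), (uint8_t)(b / 4) };
    return a;
}

/* Bubble sort by luminance, mirroring sort() in the posted kernel. */
void sort_by_lum(rgb *v, size_t n)
{
    int swapped = 1;
    while (swapped && n > 1) {
        swapped = 0;
        for (size_t i = 0; i + 1 < n; i++) {
            if (rgb_to_lum(v[i]) > rgb_to_lum(v[i + 1])) {
                rgb t = v[i];
                v[i] = v[i + 1];
                v[i + 1] = t;
                swapped = 1;
            }
        }
        n--;
    }
}

/* Expected background pixel at row i, column j of the half-size output.
 * Hypothetical helper: frames are assumed to be stored back to back. */
rgb reference_bg_pixel(const rgb *buf_in, size_t w_bg, size_t h_bg,
                       size_t n_frames, size_t i, size_t j)
{
    const size_t w_in = w_bg * 2, h_in = h_bg * 2;
    const size_t frame_size = w_in * h_in;
    const size_t pos_in = w_in * i * 2 + j * 2;
    rgb temp[MAX_FRAMES];

    if (n_frames == 0) {
        rgb zero = { 0, 0, 0 };
        return zero;
    }
    if (n_frames > MAX_FRAMES)
        n_frames = MAX_FRAMES;

    for (size_t k = 0; k < n_frames; k++)
        temp[k] = average_2x2(buf_in + k * frame_size, pos_in, w_in);

    sort_by_lum(temp, n_frames);
    return temp[n_frames / 2];
}
```

Comparing the result of reference_bg_pixel with buf_bg[w_bg*i + j] read back from the device for a few (i, j) pairs would show whether the writes to temp inside the loop are being lost.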

0 Likes
35 Replies
mux85
Journeyman III

i forgot to say that the array that causes the problem is the one called temp in the code i posted

0 Likes
omkaranathan
Adept I

mux85,

Could you post the runtime side code too? A compilable test case would make it easier for us to tackle the problem.

 

0 Likes

i'm using an opencl wrapper for c# (cloo). i am using also other external libraries to capture and write the video frames. if you want i can send the whole project but i'll probably need an email address

0 Likes

You can send it to streamdeveloper@amd.com

0 Likes

I get a Byte Addressable Stores Invalid in SKA.

Can you just vectorize this code? So that instead of operating on one rgb, you are operating on 4 at a time?

0 Likes

Originally posted by: ryta1203 I get a Byte Addressable Stores Invalid in SKA. Can you just vectorize this code? So that instead of operating on one rgb, you are operating on 4 at a time?

probably your device does not support that extension. i get no such error in ska

0 Likes

Originally posted by: mux85 probably your device does not support that extension. i get no such error in ska

I'm using a HD5870. You are using a 5850, so I'm not sure what the diff would be!?

0 Likes

Originally posted by: ryta1203 I'm using a HD5870. You are using a 5850, so I'm not sure what the diff would be!?

in this case you shouldn't have any problem with cl_khr_byte_addressable_store, just as i don't. in fact i don't think it is the origin of my problem: other kernels with that extension enabled work well

0 Likes

This could be a problem with the SKA or the driver that it uses, I don't know.

I have a 5870 in my machine and am attempting to compile your kernel in SKA for a 5870. I still get the error.

0 Likes

Originally posted by: ryta1203 This could be a problem with the SKA or the driver that it uses, I don't know. I have a 5870 in my machine and am attempting to compile your kernel in SKA for a 5870. I still get the error.

don't know what to say. we should wait for someone with more experience. ska compiles this kernel fine for me

0 Likes

Which version of SKA are you using? Which driver is on your system?

0 Likes

i'm using the latest version of the stream sdk with the ska included in it.

my drivers are the april catalyst.

0 Likes

Originally posted by: mux85 i'm using the latest version of the stream sdk with the ska included in it. my drivers are the april catalyst.

mine too, odd.

0 Likes

omkaranathan, have you received my e-mail? thanks

0 Likes

Originally posted by: omkaranathan You can send it to streamdeveloper@amd.com

hi, i've sent a mail with the title of this discussion as the subject. let me know whether you have received it. thanks

0 Likes

Mux85,

I have received the mail and am looking into the issue.

 

0 Likes

any news? have you been able to build the project and replicate the issue? thanks

0 Likes

mux85,

Sorry I couldn't reply earlier. It looks like you used VS2010, which I don't have at the moment. Could you send the debug binaries of your application?

 

0 Likes

ok, i will send you the binaries as soon as possible. and yes i use vs 2010, sorry for the inconvenience

0 Likes

i have sent all the files in the debug binaries folder that i hadn't already sent. note that i have appended an underscore to the end of the exe file names because of gmail's attachment policies. thanks

0 Likes

omkaranathan, have you made any progress on this problem?

0 Likes

any news? the problem is still there with sdk 2.2

0 Likes

mux85,
Sorry about this. I must have missed this one. I'll look into getting it fixed for the next release. The problem is not just the array; it is caused by a combination of byte types, structs and arrays.
0 Likes

ok, I understand. was the code that i sent useful?

thanks

0 Likes
Tasp
Journeyman III

just for the record:

I confirm this bug, sdk 2.2 hd5850.

This cost me lots of time!

mux85,
Sorry about this. I must have missed this one. I'll look into getting it fixed for the next release.


How about setting up a professional bug tracking site for us customers?

*edit*

Is there a workaround for this?
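
No workaround is confirmed anywhere in this thread. Given the description of the bug as a combination of byte types, structs and arrays, one thing that might be worth trying is to keep each pixel in the private array as a single packed 32-bit word instead of a struct of three uchar. The sketch below is plain C so that it can be checked on the host; in OpenCL C the types would be uchar and uint instead of uint8_t and uint32_t. All function names are hypothetical, and whether this actually avoids the compiler problem on SDK 2.2 is untested.

```c
#include <stdint.h>

/* Pack r, g, b into the low 24 bits of one word, laid out as 0x00BBGGRR. */
uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint32_t)r | ((uint32_t)g << 8) | ((uint32_t)b << 16);
}

uint8_t red_of(uint32_t p)   { return (uint8_t)(p & 0xFFu); }
uint8_t green_of(uint32_t p) { return (uint8_t)((p >> 8) & 0xFFu); }
uint8_t blue_of(uint32_t p)  { return (uint8_t)((p >> 16) & 0xFFu); }

/* Same luminance weights as rgbToLum in the posted kernel. */
float lum_of_packed(uint32_t p)
{
    return 0.3f * red_of(p) + 0.59f * green_of(p) + 0.11f * blue_of(p);
}

/* Median by luminance of n packed pixels, using the same bubble sort
 * as the posted kernel. The array v is sorted in place. */
uint32_t median_packed(uint32_t *v, unsigned n)
{
    unsigned m = n;
    int swapped = 1;

    if (n == 0)
        return 0;

    while (swapped && m > 1) {
        swapped = 0;
        for (unsigned i = 0; i + 1 < m; i++) {
            if (lum_of_packed(v[i]) > lum_of_packed(v[i + 1])) {
                uint32_t t = v[i];
                v[i] = v[i + 1];
                v[i + 1] = t;
                swapped = 1;
            }
        }
        m--;
    }
    return v[n / 2];
}
```

In the kernel, temp would then be declared as an array of MAX_FRAMES uint values, each average would be packed before being stored in temp[k], and the median would be unpacked into an rgb only when writing to buf_bg.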

0 Likes

any news about this issue? i'm using another kernel with a for loop. the result is the same: it seems that the code inside the loop is not executed when the kernel is executed on a gpu. on cpu everything is fine. can someone suggest a workaround for this? (apart from using cpu instead of gpu)

thanks

0 Likes

the problem is still there with Catalyst APP 10.10

0 Likes

this is more likely an OpenCL compiler issue than a driver issue, so you can expect it to be fixed in the next release of the SDK, not of the driver.

0 Likes
mux85
Journeyman III

understood. i thought that everything was inside opencl.dll and that the only difference with the new driver was that the dll now comes with the driver instead of being included in the sdk

0 Likes

mux85,
The catalyst 10.10 DLL is the same as the SDK 2.2 DLL. Your issue will be fixed in SDK 2.3.
0 Likes

I'm still having this problem in one of my kernels. This time the array is not fixed size but a buffer passed as a parameter with the global qualifier. Data doesn't seem to be written to it when it is used inside a loop

0 Likes

Sorry, in this case it was my fault: the buffer was created using CL_MEM_USE_HOST_PTR instead of CL_MEM_ALLOC_HOST_PTR

0 Likes