

norop
Journeyman III

Byte addressable storing bug???

Hello,

I want to use byte-addressable stores, but they are not working correctly.

I think it's a bug. Is there a plan to fix it?

The attached code is supposed to clear the array contents, but the array is not initialized.

If ARRAYSIZE is 1, it works fine.

Using char4 instead of 'char[4]' works fine, but I want to use a more complex structure.

 

    #pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable
    #define ARRAYSIZE 2
    ...
    struct { char element[4]; } ch4[ARRAYSIZE];
    for(int i = 0; i < 2; i++)
    {
        ch4[i].element[0] = 0;
        ch4[i].element[1] = 0;
        ch4[i].element[2] = 0;
        ch4[i].element[3] = 0;
    }

0 Likes
5 Replies
omkaranathan
Adept I

norop,

Please provide more information.

What is your system configuration (OS/GPU/driver)?

What is the error message/behavior you are getting?

Please post the kernel/runtime code, which will make it easier to reproduce the issue.

0 Likes
norop
Journeyman III

Ok.

I am running this program on Windows 7 (x64). The GPU is a Radeon HD 5870 and the driver version is Catalyst 10.4.

No error message appears, but the result is corrupted.

Please try running the attached code. The host code is based on the Template sample.

It seems 'chunk' is initialized at first, but reverts to an uninitialized state once execution leaves the scope in which the initialization was performed.

Is this enough information?

 

Host code: based on the Template sample. The modifications are:

1. line 245
   old: CL_DEVICE_TYPE_CPU,
   new: CL_DEVICE_TYPE_GPU,
2. line 658
   old: std::cout << arrayData[i] << " ";
   new:
       if(i%4 == 0) std::cout << std::endl;
       std::cout << ((arrayData[i] & 0x000000ff) >> 0 ) << " "
                 << ((arrayData[i] & 0x0000ff00) >> 8 ) << " "
                 << ((arrayData[i] & 0x00ff0000) >> 16) << " "
                 << ((arrayData[i] & 0xff000000) >> 24) << ", ";

Kernel code:

    #pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable

    typedef struct {
        uchar c1;
        uchar c2;
        uchar c3;
        uchar c4;
    } Data;

    typedef union {
        Data dps[42];
    } Chunk;

    __kernel void templateKernel(__global unsigned int * output,
                                 __global unsigned int * input,
                                 const unsigned int multiplier)
    {
        Chunk chunk;
        if(get_global_id(0) == 0)
        {
            Data dp;
            for(int idx = 0; idx < 21; idx++)
            {
                dp.c1 = 5;
                dp.c2 = 5;
                dp.c3 = 5;
                dp.c4 = 5;
                chunk.dps[idx] = dp;
                ((__global Data*)output)[idx +0 ] = chunk.dps[idx];
            }
            for(int idx = 0; idx < 21; idx++)
            {
                ((__global Data*)output)[idx +32] = chunk.dps[idx];
            }
        }
    }

The result:

....
Output:
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
0 1 2 3, 4 5 6 7, 8 9 10 11, 12 13 14 15,
16 17 18 19, 20 21 22 23, 24 25 26 27, 28 29 30 31,
32 33 34 35, 36 37 38 39, 40 41 42 43, 44 45 46 47,
48 49 50 51, 52 53 54 55, 56 57 58 59, 60 61 62 63,
64 65 66 67, 68 69 70 71, 72 73 74 75, 76 77 78 79,
80 81 82 83, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
....

But the expected output is:

....
Output:
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 5 5 5 5, 5 5 5 5, 5 5 5 5,
5 5 5 5, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
205 205 205 205, 205 205 205 205, 205 205 205 205, 205 205 205 205,
....
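
For readers following the output above: each comma-separated group is one 32-bit word of the output buffer, split into its four bytes by the masks and shifts in the modified print loop. The following is a minimal C sketch of that decoding, not the Template sample's actual code; the function names unpack_bytes and print_words are hypothetical. Under this decoding, "5 5 5 5" is the word 0x05050505 and "205 205 205 205" is 0xCDCDCDCD. 0xCD is the value the MSVC debug runtime writes into freshly allocated heap memory, so those slots were possibly never written at all, though that is an assumption about how the host buffer was allocated.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Split one 32-bit word into its four bytes, lowest byte first,
   using the same masks and shifts as the modified print loop. */
void unpack_bytes(uint32_t word, uint8_t out[4])
{
    out[0] = (uint8_t)((word & 0x000000ffu) >> 0);
    out[1] = (uint8_t)((word & 0x0000ff00u) >> 8);
    out[2] = (uint8_t)((word & 0x00ff0000u) >> 16);
    out[3] = (uint8_t)((word & 0xff000000u) >> 24);
}

/* Print n words as "b0 b1 b2 b3, " groups, starting a new row
   before every fourth word, as the modified host code does. */
void print_words(const uint32_t *words, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t b[4];
        if (i % 4 == 0) {
            printf("\n");
        }
        unpack_bytes(words[i], b);
        printf("%u %u %u %u, ", (unsigned)b[0], (unsigned)b[1],
               (unsigned)b[2], (unsigned)b[3]);
    }
    printf("\n");
}
```

Read this way, the unexpected "0 1 2 3" group in the actual result corresponds to the word 0x03020100.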

0 Likes

norop,
Thanks for reporting this. I'll see what I can do to make sure this is fixed by the next release.
0 Likes

I found a solution.

The problem may not be caused by byte-addressable stores, but by storing a whole structure.

Accessing the data via a member works fine!

See the attached kernel code. I defined a union whose members are the original struct and a uint (both are 4 bytes long).

Thanks!

    #pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable

    typedef union {
        struct {
            uchar c1;
            uchar c2;
            uchar c3;
            uchar c4;
        };
        uint ui;
    } Data;

    typedef union {
        Data dps[42];
    } Chunk;

    __kernel void templateKernel(__global unsigned int * output,
                                 __global unsigned int * input,
                                 const unsigned int multiplier)
    {
        Chunk chunk;
        if(get_global_id(0) == 0)
        {
            Data dp;
            for(int idx = 0; idx < 21; idx++)
            {
                dp.c1 = 5;
                dp.c2 = 5;
                dp.c3 = 5;
                dp.c4 = 5;
                chunk.dps[idx].ui = dp.ui;
                ((__global Data*)output)[idx +0 ].ui = chunk.dps[idx].ui;
            }
            for(int idx = 0; idx < 21; idx++)
            {
                ((__global Data*)output)[idx +32].ui = chunk.dps[idx].ui;
            }
        }
    }
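
The same pattern can be tried on the host in plain C to see what the workaround relies on: the four bytes and the 32-bit word share storage, so copying the word copies all four bytes in one store. This is only a sketch under assumptions. It uses C11 anonymous structs and <stdint.h> types in place of OpenCL's uchar and uint, the helper names fill_bytes and copy_words are hypothetical, and it says nothing about how the GPU compiler handles whole-struct assignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side mirror of the kernel's Data union: four bytes that can
   also be read or written as a single 32-bit word. */
typedef union {
    struct { uint8_t c1, c2, c3, c4; };
    uint32_t ui;
} Data;

_Static_assert(sizeof(Data) == 4, "Data must be exactly one 32-bit word");

/* Set every byte member of n elements to value (hypothetical helper). */
void fill_bytes(Data *d, size_t n, uint8_t value)
{
    for (size_t i = 0; i < n; i++) {
        d[i].c1 = value;
        d[i].c2 = value;
        d[i].c3 = value;
        d[i].c4 = value;
    }
}

/* Copy n elements through the word member, as the workaround kernel
   does, instead of assigning whole structs (hypothetical helper). */
void copy_words(Data *dst, const Data *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i].ui = src[i].ui;
    }
}
```

Filling 21 elements with 5 and copying them with copy_words should leave every destination word equal to 0x05050505, which is the "5 5 5 5" pattern expected in the output.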

0 Likes

norop,
This has been fixed internally and should be in the upcoming release.
0 Likes