Hi, I'm developing an OpenCL image processing application.
With OpenCL 1.0 I was passing the image to my kernel as an array of bytes (host side) and using it inside the kernel as an array of a structure I defined myself (3 uchar elements).
Now I've updated to Stream SDK 2.2 and I wanted to use one of the new OpenCL 1.1 features, the 3-component vector types. I replaced my structure with uchar3, and now the entire program crashes.
I think it might be because a uchar3 is stored as 4 bytes. In the array I pass to the kernel, every 3 bytes represent one pixel (RGB), so I think 1 byte in every 4 ends up "lost" in the 4th byte of the uchar3. The crash is probably caused by the program eventually accessing memory it shouldn't: it reads 4 bytes for every pixel instead of 3, so at some point it goes past the end of the array.
Is there a way to solve this problem? Is it better to go back to my 3-byte structure (performance-wise)?
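To make the suspected overrun concrete, here is a small plain-C sketch of the arithmetic. It assumes, as the OpenCL 1.1 specification states, that a uchar3 occupies 4 bytes, so element i of a uchar3 array starts at byte 4*i, while the packed host buffer holds only 3 bytes per pixel. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes the host allocated for a tightly packed RGB image (3 per pixel). */
size_t packed_rgb_bytes(size_t num_pixels)
{
    return num_pixels * 3;
}

/* Byte offset at which element i starts when the same buffer is indexed
 * as an array of uchar3, assuming sizeof(uchar3) == 4. */
size_t uchar3_element_offset(size_t i)
{
    return i * 4;
}

/* Index of the first pixel whose 4-byte read extends past the end of the
 * packed buffer: the smallest i with 4 * (i + 1) > 3 * num_pixels. */
size_t first_out_of_bounds_pixel(size_t num_pixels)
{
    assert(num_pixels > 0);
    return (num_pixels * 3) / 4;
}
```

For a 4-pixel image the buffer is 12 bytes, and the element at index 3 would start at byte 12, just past the end. Elements before that stay inside the buffer but no longer line up with the original pixels.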
Originally posted by: mux85 Is there a way to solve this problem? Is it better to go back to my 3-byte structure (performance-wise)?
You need to lay out the data the way uchar3 requires. One advantage of using a 3-component vector is that you can use almost all of the operators and functions that are available for the other vector types.
So I should put a filler byte after every 3 bytes? Is that what you are suggesting?
Couldn't I use the vload3 function instead? What performance impact would I get if I use vload3 for every access instead of indexing the array normally (with [])?
Thanks
Originally posted by: mux85 So I should put a filler byte after every 3 bytes? Is that what you are suggesting?
Yes, that is what I am suggesting.
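A minimal host-side sketch of the filler-byte approach, in plain C. The function name pad_rgb_to_rgbx and the choice of 0 as the filler value are assumptions for illustration and do not come from any SDK. The idea is to expand each 3-byte pixel to 4 bytes before creating the buffer, so that the layout matches the 4-byte size of uchar3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Copies a tightly packed RGB image into a newly allocated buffer with one
 * filler byte after every pixel (R, G, B, 0). The caller owns the result
 * and must free() it. Returns NULL if the allocation fails. */
unsigned char *pad_rgb_to_rgbx(const unsigned char *rgb, size_t num_pixels)
{
    assert(rgb != NULL);

    unsigned char *rgbx = malloc(num_pixels * 4);
    if (rgbx == NULL) {
        return NULL;
    }

    for (size_t i = 0; i < num_pixels; ++i) {
        rgbx[4 * i + 0] = rgb[3 * i + 0]; /* R */
        rgbx[4 * i + 1] = rgb[3 * i + 1]; /* G */
        rgbx[4 * i + 2] = rgb[3 * i + 2]; /* B */
        rgbx[4 * i + 3] = 0;              /* filler, unused by the kernel */
    }
    return rgbx;
}
```

The padded buffer is 4 * num_pixels bytes, so the device buffer would need to be created with that size rather than 3 * num_pixels. The cost is one third more memory and an extra copy on the host.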
Couldn't I use the vload3 function instead? What performance impact would I get if I use vload3 for every access instead of indexing the array normally (with [])?
Performance will be improved if you use vload3. The performance of vload3 and vload4 is the same because both access a 4-component aligned address.
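For reference, the OpenCL 1.1 specification describes vload3(offset, p) as reading its x, y and z components from the address p + offset * 3, where p points to the scalar element type. That suggests it can read a tightly packed RGB buffer without a filler byte. How it compares with vload4 in speed likely depends on the implementation. The plain-C sketch below only mimics that address computation on the host. It is not OpenCL code, and rgb_pixel and load3_packed are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3-component pixel, standing in for the x, y, z of a uchar3. */
typedef struct {
    unsigned char x, y, z;
} rgb_pixel;

/* Mimics the addressing of vload3(offset, p): the three components are
 * read from p + offset * 3, i.e. with a stride of 3 bytes, not 4. */
rgb_pixel load3_packed(size_t offset, const unsigned char *p)
{
    assert(p != NULL);

    const unsigned char *q = p + offset * 3;
    rgb_pixel px = { q[0], q[1], q[2] };
    return px;
}
```

In a kernel, the corresponding call would be along the lines of vload3(i, image), with image declared as a __global const uchar pointer to the packed data rather than as a uchar3 pointer.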