

landmann
Journeyman III

uchar16 vs. float4

Hi,

I am again working on a kind of memory transpose kernel. I noticed that when the pointers use the uchar16 data type, the compiler generates 4 read and 4 write instructions to transfer one element ( dest[idx] = src[idx2] ), whereas declaring the pointers as float4 generates only one read and one write instruction to transfer the same amount of data (16 bytes per element in both cases).
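To make the comparison concrete, here is a minimal host-side sketch in plain C, not the actual kernel. The struct types `uchar16_t` and `float4_t` and all function names are hypothetical stand-ins that only mimic the size of the OpenCL built-in types; in a real kernel the pointers would be declared as `__global uchar16 *` or `__global float4 *`. It only illustrates that both variants of dest[idx] = src[idx2] move the same 16 bytes; it says nothing about the instructions a GPU compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the OpenCL vector types: same size, nothing more. */
typedef struct { uint8_t s[16]; } uchar16_t;
typedef struct { float   s[4];  } float4_t;

/* dest[idx] = src[idx2] with the element typed as 16 unsigned chars. */
static void copy_as_uchar16(uchar16_t *dest, const uchar16_t *src,
                            size_t idx, size_t idx2)
{
    dest[idx] = src[idx2];
}

/* The same transfer with the element typed as 4 floats. */
static void copy_as_float4(float4_t *dest, const float4_t *src,
                           size_t idx, size_t idx2)
{
    dest[idx] = src[idx2];
}

/* Returns 1 if both typed copies deliver exactly the input bytes. */
static int copies_match(const uint8_t bytes[16])
{
    uchar16_t src_u[1], dst_u[1];
    float4_t  src_f[1], dst_f[1];

    /* Both element types must span 16 bytes for the comparison to hold. */
    assert(sizeof(uchar16_t) == 16 && sizeof(float4_t) == 16);

    memcpy(src_u, bytes, 16);
    memcpy(src_f, bytes, 16);
    memset(dst_u, 0, sizeof dst_u);
    memset(dst_f, 0, sizeof dst_f);

    copy_as_uchar16(dst_u, src_u, 0, 0);
    copy_as_float4(dst_f, src_f, 0, 0);

    return memcmp(dst_u, bytes, 16) == 0 &&
           memcmp(dst_f, bytes, 16) == 0;
}
```

Since the bytes transferred are identical, the difference described above concerns only how the compiler maps each element type onto memory instructions.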

What prevents the compiler from doing the same operation for the uchar16 data type?

Thanks!

Joerg

12 Replies