Hello!
In my kernel I am using a constant array of bytes:
__constant uchar myconst[256] = { 0x5a, 0xe6, 0x61, 0xf4, 0x31, 0xe3, 0x85 ....................skipped values here...............};
But when I look at the generated IL assembly code, it seems that the 8-bit array values are packed into 128-bit constant registers:
...
dcl_cb cb2[16] <=here is my array
...
mov r1011, cb2[r1007.x]
cmov_logical r1011.x___, r1008.y, r1011.y, r1011.x <=unpacking of the value from the 128-bit register starts here
cmov_logical r1011.x___, r1008.z, r1011.z, r1011.x
cmov_logical r1011.x___, r1008.w, r1011.w, r1011.x
iand r1006.x___, r1010.x, l15
iadd r1006, r1006.x, l16
ieq r1008, r1006, l12
ishr r1011, r1011.x, l17
cmov_logical r1011.x___, r1008.y, r1011.y, r1011.x
cmov_logical r1011.x___, r1008.z, r1011.z, r1011.x
cmov_logical r1011.x___, r1008.w, r1011.w, r1011.x
ishl r1011.x___, r1011.x, l18
ushr r1011.x___, r1011.x, l18
mov r65._y__, r1011.x
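To make clear what I think the listing is doing, here is a rough C model of the unpacking. It is only a sketch under assumptions: `cb_reg`, `pack_bytes` and `load_byte` are hypothetical names of mine, the byte order inside each 32-bit component is assumed to be little-endian, and the shift and mask amounts are guesses, since the values of the literals `l12` and `l15` to `l18` are not visible in the listing. The only thing taken from the IL itself is that 256 bytes occupy `cb2[16]`, i.e. 16 bytes per register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit constant register:
   four 32-bit components (x, y, z, w). */
typedef struct { uint32_t c[4]; } cb_reg;

/* Pack a byte table into registers, 16 bytes per register.
   Assumption: byte i lands in bits 8*(i%4) of its component. */
static void pack_bytes(const uint8_t *src, size_t n, cb_reg *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i) {
        uint32_t *word = &dst[i / 16].c[(i / 4) % 4];
        if (i % 4 == 0)
            *word = 0;                        /* start a fresh component */
        *word |= (uint32_t)src[i] << (8 * (i % 4));
    }
}

/* Simplified equivalent of the per-access work in the listing. */
static uint8_t load_byte(const cb_reg *cb, uint32_t index)
{
    cb_reg   r    = cb[index >> 4];           /* mov: fetch the whole register      */
    uint32_t word = r.c[(index >> 2) & 3];    /* cmov_logical chain: pick x/y/z/w   */
    uint32_t sh   = word >> (8 * (index & 3));/* ishr: move the byte to bits 0..7   */
    return (uint8_t)((sh << 24) >> 24);       /* ishl + ushr: clear the upper bits  */
}
```

So a single byte read seems to cost a fetch plus a chain of selects, shifts and masks, and that is the overhead I would like to avoid.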
Is there any way to prevent this behavior and store only one array element per constant register? Maximum speed is my goal.