This kernel builds and runs fine on the CPU, but on the GPU I get a "Link failed" error when calling clBuildProgram. Any ideas why? Sorry if I'm missing something obvious.
Hardware and OS: ATI Radeon 4850, Ubuntu 9.10.
#define BSWAP_64(x) (((ulong4)(x) << (ulong4)56) | \
    (((ulong4)(x) << (ulong4)40) & (ulong4)0xff000000000000UL) | \
    (((ulong4)(x) << (ulong4)24) & (ulong4)0xff0000000000UL) | \
    (((ulong4)(x) << (ulong4)8) & (ulong4)0xff00000000UL) | \
    (((ulong4)(x) >> (ulong4)8) & (ulong4)0xff000000UL) | \
    (((ulong4)(x) >> (ulong4)24) & (ulong4)0xff0000UL) | \
    (((ulong4)(x) >> (ulong4)40) & (ulong4)0xff00UL) | \
    ((ulong4)(x) >> (ulong4)56))

__kernel void DES( __global ulong4 * pInput, __constant ulong * pSubkeys, int nNumElements)
{
    const uint index = get_global_id(0);
    if(index>=nNumElements) return;

    ulong4 plain;
    plain = BSWAP_64(pInput[index]);
    pInput[index] = plain;
}
Thanks for the workaround.
Hi, can you explain how to do a shift on a uint4 without getting the "operation requires two vectors of the same size" error?
#define BSWAP_64(x) (((x) << 56) | \
    (((x) << 40) & 0xff000000000000UL) | \
    (((x) << 24) & 0xff0000000000UL) | \
    (((x) << 8) & 0xff00000000UL) | \
    (((x) >> 8) & 0xff000000UL) | \
    (((x) >> 24) & 0xff0000UL) | \
    (((x) >> 40) & 0xff00UL) | \
    ((x) >> 56))

__kernel void DES( __global ulong4 * pInput, __constant ulong * pSubkeys, int nNumElements)
{
    const uint index = get_global_id(0);
    if(index>=nNumElements) return;

    ulong4 plain;
    plain.x = BSWAP_64(pInput[index].x);
    plain.y = BSWAP_64(pInput[index].y);
    plain.z = BSWAP_64(pInput[index].z);
    plain.w = BSWAP_64(pInput[index].w);
    pInput[index] = plain;
}
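For anyone reusing the per-component version, here is a minimal host-side sketch in plain C (not OpenCL) that sanity-checks the scalar BSWAP_64 macro against a byte-by-byte reversal. The helper names bswap64_scalar, bswap64_reference and bswap64_self_check are made up for this illustration and are not part of the kernel. The mask suffixes are written as ULL here, on the assumption that the host's unsigned long may be only 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Same shifts and masks as the kernel macro; ULL suffix keeps the
   constants 64-bit on every host compiler (an assumption for host use). */
#define BSWAP_64(x) (((x) << 56) | \
    (((x) << 40) & 0xff000000000000ULL) | \
    (((x) << 24) & 0xff0000000000ULL) | \
    (((x) << 8)  & 0xff00000000ULL) | \
    (((x) >> 8)  & 0xff000000ULL) | \
    (((x) >> 24) & 0xff0000ULL) | \
    (((x) >> 40) & 0xff00ULL) | \
    ((x) >> 56))

/* Hypothetical wrapper: applies the macro to one 64-bit value,
   as the kernel does for each of .x, .y, .z and .w. */
static uint64_t bswap64_scalar(uint64_t v)
{
    return BSWAP_64(v);
}

/* Hypothetical reference: reverses the byte order one byte at a time. */
static uint64_t bswap64_reference(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xff);  /* append the lowest byte of v to r */
        v >>= 8;                    /* move on to the next byte */
    }
    return r;
}

/* Hypothetical self-check: returns 1 if the macro agrees with the
   reference on a few sample values and is its own inverse. */
static int bswap64_self_check(void)
{
    const uint64_t samples[] = {
        0x0102030405060708ULL, 0x0ULL, 0xffffffffffffffffULL,
        0x00000000000000ffULL, 0xdeadbeefcafebabeULL
    };
    for (int i = 0; i < 5; ++i) {
        uint64_t s = samples[i];
        assert(bswap64_scalar(s) == bswap64_reference(s));
        assert(bswap64_scalar(bswap64_scalar(s)) == s);
    }
    return 1;
}
```

Calling bswap64_self_check() from a small test program exercises the same arithmetic the kernel applies to each component, so a mistake in a mask or shift count shows up on the host before the kernel is built for the GPU.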
Thanks, I had already done it that way. My last question was out of curiosity, because as I complete the kernel I will have more functions that work with shifts. I already wrote a scalar version of the complete kernel, and it works fine. Anyway, thanks a lot for your support.