If I add restrict to the kernel parameters, as in:
__global float2 *restrict a,
const __global float4 *restrict b
the kernel produces incorrect results with Cat. 11.3 / 5750 / Win7_64. If I remove the restrict keyword, it works OK. Here is the version without restrict:
__kernel __attribute__((reqd_work_group_size(WORK_WIDTH, WORK_HEIGHT, 1))) /* work width=16, work height=8 */
void MyKernel(__global float2 *a, const __global float4 *b)
{
    const size_t g1D = get_global_id(1U)*GRID_WIDTH + get_global_id(0U); /* grid width=256U */

    if (b[g1D].x > 0.0f)
    {
        a[g1D].x += 1.0f;
    }
}
/* note: I use float4 because I assign YZW components for other kernel */
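One way to pin down which build is wrong is to compare the buffer read back from the device against a plain CPU reference. The following is only a sketch in host-side C: the struct types `ref_float2` / `ref_float4` and the function names `my_kernel_reference` and `first_mismatch` are hypothetical stand-ins, not part of the original code, and it assumes the buffers have already been read back into host memory.

```c
#include <stddef.h>

/* Hypothetical host-side stand-ins for the OpenCL float2/float4 types. */
typedef struct { float x, y; } ref_float2;
typedef struct { float x, y, z, w; } ref_float4;

/* CPU reference for MyKernel: the same per-element logic, run serially,
 * so no restrict or aliasing assumption is involved. */
static void my_kernel_reference(ref_float2 *a, const ref_float4 *b, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (b[i].x > 0.0f) {
            a[i].x += 1.0f;
        }
    }
}

/* Returns the index of the first element where the two buffers differ,
 * or count if every element matches. */
static size_t first_mismatch(const ref_float2 *expected,
                             const ref_float2 *actual, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (expected[i].x != actual[i].x || expected[i].y != actual[i].y) {
            return i;
        }
    }
    return count;
}
```

Running `my_kernel_reference` on a copy of the input and then calling `first_mismatch` against the device output from each build (with and without restrict) would show where the results first diverge. Exact comparison should be reasonable here because the kernel only adds 1.0f, but that is an assumption about this particular kernel.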
Hi bubu,
In one of the posts (which I can't find now), it was suggested to use const restrict instead of restrict. Does the flag -fno-alias work for you?
It would be easy for us to reproduce the issue if you can also post the host code.
Originally posted by: himanshu.gautam
In one of the posts (which I can't find now), it was suggested to use const restrict instead of restrict. Does the flag -fno-alias work for you?
Well, I'm using const already, unless you mean const __global float4 *const restrict?
bubu,
http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=149689&highlight_key=y