cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

Raistmer
Adept II

The usage of restrict modifier in __local memory space

Hello.

When I add restrict modifier to pointer from __local memory space no errors generated by compiler but kernel silently starts to produce invalid results.

But particular memory location accessed only via that pointer (but from different workitems, local memory used in reduction procedure).

So, what meaning "restrict" keyword has if it used for __local type* restrict  pointer?

Does such usage

kernel k(__local type* restrict pp){

const int tid=get_local_id(0);

__local type* restrict p=pp+1;

p[tid]=1;

barrier(CLK_LOCAL_MEM_FENCE);

if(tid==0){p[0]+=p[1];}

barrier(CLK_LOCAL_MEM_FENCE);

type a=p[0];

}

contradict with restrict meaning?

0 Likes
4 Replies
rampitec
Staff

Yes, this usage in fact contradicts the restrict meaning. There shall be no aliases if restrict is specified and p is an alias to pp. Compiler shall not issue warnings if an user deliberately used restrict. A tool like lint can detect an error though, but again is not required to. On the practical side compiler may assume there were no writes to pp.

0 Likes

Thanks for answer.

But what would be if pp was used everywhere, not p in sample above?

Would there be contradiction with restrict meaning still or not?

Regarding "subsampling" of restrict pointers: Demystifying The Restrict Keyword - CellPerformance

Restricted pointers can be copied from one to another to create a hierarchy of pointers. However there is one limitation defined in the C99 standard. The child pointer must not be in the same block-level scope as the parent pointer. The result of copying restricted pointers in the same block-level scope is undefined.

So, in my example above I should not use pp directly after p definition. But seems subsampling per se not an issue. Nevertheless, to avoid these ambiguities lets consider such kernel:

kernel k(__local type* restrict p){

const int tid=get_local_id(0);

p[tid]=1;

barrier(CLK_LOCAL_MEM_FENCE);

if(tid==0){p[0]+=p[1];}

barrier(CLK_LOCAL_MEM_FENCE);

type a=p[0];

}

That is, no subsampling of restrict p, but still access through p pointer from different workitems (providing it's real kernel, not task).

The same question - is it legal to use restrict here? (p[1] was written by workitem with tid==1 and then read from tid==0 workitem )

0 Likes

I see no issues here. Restrict applies to the pointers as defined in the source, it does not carry any additional semantics when multiple threads access the same memory. In fact you have taken care about the latter with the barriers already.

Thanks for answer.

I'll try to reproduce wrong computations with similar kernel and restrict pointer (with addition of separate {} block to be sure it's not subsampling issue) and come back with bug report then.

0 Likes