4 Replies Latest reply on Nov 8, 2016 4:28 AM by Raistmer

    The usage of restrict modifier in __local memory space

    Raistmer

      Hello.

       

      When I add restrict modifier to pointer from __local memory space no errors generated by compiler but kernel silently starts to produce invalid results.

      But particular memory location accessed only via that pointer (but from different workitems, local memory used in reduction procedure).

      So, what meaning "restrict" keyword has if it used for __local type* restrict  pointer?

      Does such usage

      kernel k(__local type* restrict pp){

      const int tid=get_local_id(0);

      __local type* restrict p=pp+1;

      p[tid]=1;

      barrier(CLK_LOCAL_MEM_FENCE);

      if(tid==0){p[0]+=p[1];}

      barrier(CLK_LOCAL_MEM_FENCE);

      type a=p[0];

      }

      contradict with restrict meaning?

        • Re: The usage of restrict modifier in __local memory space
          rampitec

          Yes, this usage in fact contradicts the restrict meaning. There shall be no aliases if restrict is specified and p is an alias to pp. Compiler shall not issue warnings if an user deliberately used restrict. A tool like lint can detect an error though, but again is not required to. On the practical side compiler may assume there were no writes to pp.

            • Re: The usage of restrict modifier in __local memory space
              Raistmer

              Thanks for answer.

              But what would be if pp was used everywhere, not p in sample above?

              Would there be contradiction with restrict meaning still or not?

               

              Regarding "subsampling" of restrict pointers: Demystifying The Restrict Keyword - CellPerformance

              Restricted pointers can be copied from one to another to create a hierarchy of pointers. However there is one limitation defined in the C99 standard. The child pointer must not be in the same block-level scope as the parent pointer. The result of copying restricted pointers in the same block-level scope is undefined.

              So, in my example above I should not use pp directly after p definition. But seems subsampling per se not an issue. Nevertheless, to avoid these ambiguities lets consider such kernel:

               

              kernel k(__local type* restrict p){

              const int tid=get_local_id(0);

              p[tid]=1;

              barrier(CLK_LOCAL_MEM_FENCE);

              if(tid==0){p[0]+=p[1];}

              barrier(CLK_LOCAL_MEM_FENCE);

              type a=p[0];

              }

               

              That is, no subsampling of restrict p, but still access through p pointer from different workitems (providing it's real kernel, not task).

              The same question - is it legal to use restrict here? (p[1] was written by workitem with tid==1 and then read from tid==0 workitem )