AMD OpenCL compiler ignores kernel attribute “work_group_size_hint”

Question asked by eynuel on Jun 29, 2012
Latest reply on Jul 3, 2012 by nou

I'm currently optimizing an OpenCL kernel, and have been trying to find optimal values for workgroup sizes and vector widths.

Currently I'm using an Ubuntu system with an Intel i7-3930K (6 cores @ 3.5 GHz, HT disabled) and an AMD HD6870. Both the Intel and AMD OpenCL implementations are installed to allow for comparisons (APP SDK v2.7 Linux 64-bit and Catalyst 12.4, Intel OpenCL SDK 1.5).


Running on the CPU (on the Intel platform), I've found that:


- By choosing a workgroup size of 256, I gain about 13.5% performance compared to a workgroup size of 1.

- By specifying `__attribute__((vec_type_hint(float4)))`, I gain a 30% boost.

- By specifying `__attribute__((work_group_size_hint(WG_SIZE, 1, 1)))`, I get another ~90%.
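As a quick sanity check on how the three gains in the list above combine, here is a minimal C sketch. It assumes the gains compound multiplicatively, which the measurements above do not confirm; `combined_speedup` is a hypothetical helper name, not part of any OpenCL API.

```c
#include <stddef.h>

/* Hypothetical helper: combine per-step relative gains (e.g. 0.135 for
 * +13.5%) into a single speedup factor.
 * Assumption: each gain multiplies the result of the previous steps. */
double combined_speedup(const double *gains, size_t count)
{
    double factor = 1.0;
    for (size_t i = 0; i < count; ++i) {
        factor *= 1.0 + gains[i];   /* +13.5% becomes a factor of 1.135 */
    }
    return factor;
}
```

Called with the gains 0.135, 0.30 and 0.90, it returns roughly 2.8.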


So, in total, these options can result in close to a 3x performance increase. Unfortunately, when running the same case on the CPU using the AMD OpenCL platform, I've found that I can only get about a 4% performance gain from specifying the workgroup size, and the optional attributes are ignored.


Kernel declaration is:



    kernel __attribute__(( work_group_size_hint(WG_SIZE, 1, 1) ))
           __attribute__(( vec_type_hint(VEC_SIZE) ))
    void solveEikonalEq( global  env_packed_t*  env_packed_in,
                         global  float*         packedEnvData_in,
                         private float          ds,
                         private float          freq,
                         global  ray_t*         ray,
                         global  rayMembers_t*  rayMembers){
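The declaration above relies on two macros, `WG_SIZE` and `VEC_SIZE`, whose definitions are not shown. One common way to supply such macros is through `-D name=definition` entries in the options string passed to `clBuildProgram`; the C sketch below only formats that string and assumes this is how the macros are set. `format_build_options` is a hypothetical helper. Note that `vec_type_hint` expects a type name such as `float4`, so `VEC_SIZE` has to expand to a type rather than a number.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "-D WG_SIZE=<n> -D VEC_SIZE=<type>" into out.
 * Assumption: the kernel source receives both macros as build options.
 * Returns the string length on success, or -1 if an argument is invalid
 * or the buffer is too small. */
int format_build_options(char *out, size_t out_size,
                         unsigned wg_size, const char *vec_type)
{
    if (out == NULL || out_size == 0 ||
        vec_type == NULL || strlen(vec_type) == 0) {
        return -1;
    }

    int n = snprintf(out, out_size, "-D WG_SIZE=%u -D VEC_SIZE=%s",
                     wg_size, vec_type);

    /* snprintf reports the length it wanted; reject truncated output. */
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return n;
}
```

The resulting string would then be handed to `clBuildProgram` as its `options` argument, so the same kernel source can be rebuilt for different workgroup sizes and vector types.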



And compiler output is:


    "/tmp/", line 2637: warning: unknown attribute "work_group_size_hint"
      kernel  __attribute__((work_group_size_hint(WG_SIZE, 1, 1)))

    "/tmp/", line 2638: warning: unknown attribute "vec_type_hint"
              __attribute__(( vec_type_hint(VEC_SIZE)))



Does AMD always ignore these hints? Or is there something I have to do to enable these attributes on the AMD platform?

From reading the OpenCL 1.1 spec, I had the impression that supporting these kernel attributes was not optional (even if actually acting on them is).