2 Replies Latest reply on Aug 18, 2010 3:35 AM by Fuxianjun

    about workitems number

    Fuxianjun

      If the wavefront size is 64 and I want to add two vectors that are both of length 60, how do I specify the global_work_size? Which global_work_size is better, (60,1,1) or (64,1,1)?

        • about workitems number
          genaganna

           

          Originally posted by: Fuxianjun If the wavefront size is 64 and I want to add two vectors that are both of length 60, how do I specify the global_work_size? Which global_work_size is better, (60,1,1) or (64,1,1)?

           

          global_work_size can be either (60, 1, 1) or (64, 1, 1); the local work-group size should then be (60, 1, 1) or (64, 1, 1), respectively.

          Case 1 : global_work_size (60, 1, 1) and local_work_size (60, 1, 1)

                        4 threads of the 64-wide wavefront do no useful computation, but they still execute along with the other 60.

          Case 2 : global_work_size (64, 1, 1) and local_work_size (64, 1, 1)

                        4 threads of the wavefront are used for dummy computation, so the kernel must bounds-check its index or the buffers must be padded to 64 elements.

           

                     In both cases, you should get the same performance.
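
          For Case 2, here is a minimal sketch of the bounds check in plain C. It only emulates the work-items on the host; the names vec_add_item and launch_vec_add are hypothetical and are not part of OpenCL. In a real OpenCL C kernel the index would come from get_global_id(0) and the vector length would be passed as a kernel argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one work-item of a vector-add kernel.
   In OpenCL C, gid would be obtained from get_global_id(0). */
static void vec_add_item(size_t gid, size_t n,
                         const float *a, const float *b, float *c)
{
    /* Work-items with gid >= n (60..63 in Case 2) skip the addition. */
    if (gid < n) {
        c[gid] = a[gid] + b[gid];
    }
}

/* Emulates, sequentially on the host, a launch of global_work_size
   work-items over vectors of length n. */
static void launch_vec_add(size_t global_work_size, size_t n,
                           const float *a, const float *b, float *c)
{
    /* Every element needs at least one work-item. */
    assert(global_work_size >= n);
    for (size_t gid = 0; gid < global_work_size; ++gid) {
        vec_add_item(gid, n, a, b, c);
    }
}
```

          Calling launch_vec_add(64, 60, a, b, c) touches only the first 60 elements, so the input buffers do not have to be padded to 64.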

           

                 For such a small computation, the GPU is not the right choice, because the transfer time dominates the computation time.

                You should keep at least the following two things in mind when you want to use a GPU for computation and get good performance:

                    1. local_work_size should be a multiple of the wavefront size

                    2. global_work_size should be a multiple of (local_work_size * number of compute units)
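
                The two points above can be sketched as a small host-side helper in C. The names round_up and padded_global_size are hypothetical, and treating the product local_work_size * compute_units as the rounding unit is simply a direct reading of point 2, not an official formula.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of `multiple` (which must be > 0). */
static size_t round_up(size_t value, size_t multiple)
{
    assert(multiple > 0);
    return ((value + multiple - 1) / multiple) * multiple;
}

/* Hypothetical helper: pad a problem size n up to a multiple of
   local_work_size * compute_units, as suggested in point 2. */
static size_t padded_global_size(size_t n, size_t local_work_size,
                                 size_t compute_units)
{
    return round_up(n, local_work_size * compute_units);
}
```

                The result would be passed as global_work_size to clEnqueueNDRangeKernel, with a bounds check in the kernel for the padded work-items. The compute-unit count can be queried with clGetDeviceInfo and CL_DEVICE_MAX_COMPUTE_UNITS.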