1 Reply Latest reply on Mar 19, 2011 12:43 PM by nou

    async_work_group_copy error: CL_INVALID_WORK_GROUP_SIZE



      I'm trying to speed up an OpenCL kernel by using __local memory. Before I used async_work_group_copy, I was able to set global_work_size to 128 and local_work_size to 64. Once I added the async_work_group_copy call, I could no longer set local_work_size to 64: clEnqueueNDRangeKernel returned the error code CL_INVALID_WORK_GROUP_SIZE. The following code demonstrates this.

      My device is an ATI HD 4500:


      CL_DEVICE_MAX_WORK_ITEM_SIZES: 128 128 128

      I couldn't find anything in the documentation explaining why this error occurs. Am I missing something, or doing something wrong?



      size_t* globalThreads = new size_t[1];
      globalThreads[0] = 128;
      size_t* localThreads = new size_t[1];
      localThreads[0] = 64; // 64 -> error, but 32 -> ok
      cl_int status = clEnqueueNDRangeKernel(commandQueue, kernel, 1, NULL, globalThreads, localThreads, 0, NULL, &eventSyncKernel);
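
      One thing that may be worth checking (nothing in this post confirms it as the cause): besides the device limits, each compiled kernel has its own maximum work-group size, which can be read with clGetKernelWorkGroupInfo and CL_KERNEL_WORK_GROUP_SIZE. It can be lower than the device maximum, for example when the kernel uses more registers or __local memory. The sketch below assumes that value has already been queried. pickLocalSize is a hypothetical helper name, not part of OpenCL, and the `device` variable in the comment is assumed to be the cl_device_id in use.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an OpenCL function): returns the largest local
// work size that divides globalSize evenly and does not exceed kernelMax.
//
// kernelMax is assumed to come from a query such as:
//   size_t kernelMax = 0;
//   clGetKernelWorkGroupInfo(kernel, device, CL_KERNEL_WORK_GROUP_SIZE,
//                            sizeof(size_t), &kernelMax, NULL);
std::size_t pickLocalSize(std::size_t globalSize, std::size_t kernelMax) {
    assert(globalSize > 0 && kernelMax > 0);

    // Start from the smaller of the two limits.
    std::size_t local = kernelMax < globalSize ? kernelMax : globalSize;

    // Step down until the local size divides the global size.
    // The loop always terminates, because 1 divides everything.
    while (globalSize % local != 0) {
        --local;
    }
    return local;
}
```

      With globalThreads[0] = 128, a call such as localThreads[0] = pickLocalSize(128, kernelMax) would yield 32 if the query reported 32 for this kernel, which would match the behaviour described above. Whether it actually reports 32 here is an assumption that has to be checked on the device.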