maxworkitems for clEnqueueNDRangeKernel(...)

Discussion created by atata on May 29, 2011
Latest reply on Jun 3, 2011 by laobrasuca
error when setting large number

Hi everyone.

I recently started learning OpenCL, and as a first step I tried to modify the 'Template' example program from the OpenCL examples package. That program multiplies a vector by a number; I wanted it to compute a linear combination of two vectors, b*x + y, where x and y are (complex) vectors and b is a real number. It doesn't really matter that the vectors are complex: I just enter "width" and do the calculations on vectors of length 2*width, where the first width components hold the real parts and the last width components hold the imaginary parts.
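For reference, the computation described above can be sketched on the host as plain C++. This is only an illustration of the data layout, not the 'Template' kernel: the function name `axpy_split_complex` is hypothetical, and the split layout (real parts first, imaginary parts last) is taken from the description above. Because b is real, the real and imaginary halves are processed identically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference: out = b*x + y for split-complex vectors.
// Assumed layout: elements [0, width) hold the real parts and
// elements [width, 2*width) hold the imaginary parts.
std::vector<float> axpy_split_complex(float b,
                                      const std::vector<float>& x,
                                      const std::vector<float>& y)
{
    assert(x.size() == y.size());   // both vectors have length 2*width
    assert(x.size() % 2 == 0);      // real half + imaginary half

    std::vector<float> out(x.size());
    // b is real, so every component (real or imaginary) is scaled the same way.
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = b * x[i] + y[i];
    }
    return out;
}
```

Such a reference is useful mainly for comparing against the values the kernel writes back, element by element.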

The 'Template' source file contains many different checks (whether memory was allocated correctly, etc.), including the code attached. As I understand it, the globalThreads[0] value I pass to clEnqueueNDRangeKernel(...) is the number of work items (threads) I want to run, but what is the check just before clEnqueueNDRangeKernel(...) for? According to that check, the program terminates if I try to run more threads than maxWorkItemSizes, which makes no sense to me. Moreover, when I print maxWorkItemSizes[0] it is equal to 256 (and maxWorkGroupSize is also equal to 256), so does that mean I can't run more than 256 threads? If I comment out that check and call clEnqueueNDRangeKernel(...) with globalThreads > 256, I get a BSOD or a Windows message saying something like "the video driver failed and was restored", and Visual Studio closes.

I just want to run my program with a reasonable number of threads (work items), but I can't understand what is going wrong here. The fifth argument of clEnqueueNDRangeKernel(...) is the number of work items I want to run, right? What is the check before it for, then? I didn't attach all the code, but I can if necessary (as I said, most of the code consists of various checks, and I didn't change much in the algorithm). In the 'Template' example, globalThreads[0] was set to something like 64 before I started modifying it.

I am using Win7 x64, MS VS 2010, and a Mobility Radeon HD 5870 GPU (it is the same as a desktop 5770 with lowered frequencies). I installed the latest version of the SDK and driver version 11.4 (I had 11.5 before, but reinstalled 11.4 because there is no info about adequate support of 11.4 for the current SDK version).

Thanks in advance.


size_t globalThreads[1];
size_t localThreads[1];
size_t maxWorkGroupSize;
size_t maxWorkItemSizes[3];

/**
 * Query device capabilities. Maximum
 * work item dimensions and the maximum
 * work item sizes
 */
status = clGetDeviceInfo(
    devices[0],
    CL_DEVICE_MAX_WORK_GROUP_SIZE,
    sizeof(size_t),
    (void*)&maxWorkGroupSize,
    NULL);
if(status != CL_SUCCESS)
{
    std::cout<<"Error: Getting Device Info. (clGetDeviceInfo)\n";
    getchar();
    return 1;
}

status = clGetDeviceInfo(
    devices[0],
    CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS,
    sizeof(cl_uint),
    (void*)&maxDims,
    NULL);
if(status != CL_SUCCESS)
{
    std::cout<<"Error: Getting Device Info. (clGetDeviceInfo)\n";
    getchar();
    return 1;
}

status = clGetDeviceInfo(
    devices[0],
    CL_DEVICE_MAX_WORK_ITEM_SIZES,
    sizeof(size_t)*maxDims,
    (void*)maxWorkItemSizes,
    NULL);
if(status != CL_SUCCESS)
{
    std::cout<<"Error: Getting Device Info. (clGetDeviceInfo)\n";
    getchar();
    return 1;
}

// those 2 numbers are chosen by user
globalThreads[0] = 256;
localThreads[0] = 256;

if(globalThreads[0] > maxWorkItemSizes[0] ||
   localThreads[0] > maxWorkGroupSize)
{
    std::cout<<"Unsupported: Device does not support requested number of work items.";
    return 1;
}

// some code setting kernel arguments

status = clEnqueueNDRangeKernel(
    commandQueue,
    kernel,
    1,
    NULL,
    globalThreads,
    localThreads,
    0,
    NULL,
    &events[0]);
if(status != CL_SUCCESS)
{
    std::cout<<"Error: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)\n";
}
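A note on the sizes involved, as a hedged sketch rather than a diagnosis of the crash: according to the OpenCL 1.x specification, CL_DEVICE_MAX_WORK_ITEM_SIZES and CL_DEVICE_MAX_WORK_GROUP_SIZE limit the number of work items in one work-group (the local size), not the global size. The global size may be much larger, but in OpenCL 1.x it must be evenly divisible by the local size. The helpers below are hypothetical (`round_up_global` and `sizes_are_valid` are not part of the 'Template' sample or of OpenCL) and use no OpenCL calls; they only show the arithmetic for picking a global size for n elements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round n up to the next multiple of the local
// (work-group) size, so the global size is evenly divisible by it.
std::size_t round_up_global(std::size_t n, std::size_t local)
{
    assert(local > 0);
    return ((n + local - 1) / local) * local;
}

// Hypothetical check for a 1-D launch: the device limits are compared
// against the local size only; the global size just has to be a
// multiple of the local size.
bool sizes_are_valid(std::size_t global, std::size_t local,
                     std::size_t maxWorkItemSize0,
                     std::size_t maxWorkGroupSize)
{
    return local > 0
        && local <= maxWorkItemSize0
        && local <= maxWorkGroupSize
        && global % local == 0;
}
```

For example, with n = 1000 elements and a local size of 256, the rounded global size is 1024. Since that is larger than n, the kernel would then need a guard such as `if (get_global_id(0) < n)` so the padding work items do not write out of bounds.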
