5 Replies Latest reply on Jul 15, 2010 4:15 PM by nou

    proper device number query



      My question is: what is the proper way to query the number of devices in an OpenCL program? The sample CLInfo C++ application returns the correct number, but a simple C program reports many more devices and dies with a segmentation fault as soon as the extra devices are queried for attributes, let alone used.

      The following two code samples show the C++ way and an older C way of querying the number of devices. I would like to stick with C and cl.h to keep my code portable, since NVIDIA does not ship cl.hpp (as far as I know).

      For some lucky reason the working devices are at the beginning of the devices array in the C code, so if I know how many GPUs are in the machine I can use all of them, but I would rather write code that can find this out for itself. All ideas are appreciated.


      cl::vector<cl::Device> devices;
      (*p).getDevices(CL_DEVICE_TYPE_ALL, &devices);
      std::cout << "Number of devices:\t\t\t\t " << devices.size() << std::endl;
      for (cl::vector<cl::Device>::iterator i = devices.begin(); i != devices.end(); ++i)
      {
          ...
      }

      //-------------------------------------------------------------------------------------------

      cl_device_id* devices;
      size_t devicenumber;
      clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL, &devicenumber);
      devices = (cl_device_id*)malloc(devicenumber);
      clGetContextInfo(context, CL_CONTEXT_DEVICES, devicenumber, devices, NULL);
      checkErr( devicenumber != 0 ? CL_SUCCESS : -1, "devicenumber <= 0");

        • proper device number query

          You can query the devices directly from the platform:


          status = clGetPlatformIDs(1, &platform, 0);
          cl_uint numDevices = 0;
          status = clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 0, 0, (cl_uint*)&numDevices);
          cl_device_id* devices = (cl_device_id*)malloc(numDevices * sizeof(cl_device_id));
          status = clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, numDevices, devices, 0);

            • proper device number query

              What you are probably getting with:

              clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL, &devicenumber);

              is the total size in bytes of the array returned, which is number_of_elements * sizeof(cl_device_id), so you need to divide by sizeof(cl_device_id) to get the number of devices. There's also a CL_CONTEXT_NUM_DEVICES option that you can query to get this value more directly.

              E.g., you can query the devices in a context using:

              clGetContextInfo(context, CL_CONTEXT_NUM_DEVICES, sizeof(cl_uint), &numDevices, 0);
              cl_device_id* devices = (cl_device_id*)malloc(numDevices * sizeof(cl_device_id));
              clGetContextInfo(context, CL_CONTEXT_DEVICES, numDevices*sizeof(cl_device_id), devices, 0);

              or the devices that are associated with a program using:

              clGetProgramInfo(program, CL_PROGRAM_NUM_DEVICES, sizeof(cl_uint), &numDevices, 0);
              cl_device_id* devices = (cl_device_id*)malloc(numDevices * sizeof(cl_device_id));
              clGetProgramInfo(program, CL_PROGRAM_DEVICES, numDevices*sizeof(cl_device_id), devices, 0);

              or all the devices available on a platform, as shown by n0thing.
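The byte-size point above can be illustrated without any OpenCL calls. This is only a sketch: `device_handle` is a hypothetical stand-in for `cl_device_id` (which cl.h declares as an opaque pointer type), and `handles_in_bytes` is an illustrative helper name, not part of OpenCL.

```c
#include <stddef.h>

/* Hypothetical stand-in for cl_device_id, which is an opaque pointer in cl.h. */
typedef struct fake_device *device_handle;

/* Converts the byte size reported by a size query (the last argument of
   clGetContextInfo in the original post) into a number of array elements. */
size_t handles_in_bytes(size_t bytes)
{
    return bytes / sizeof(device_handle);
}
```

If a context holds two devices, the size query reports `2 * sizeof(cl_device_id)` bytes. Using that byte value directly as a loop bound walks far past the two valid entries, which would match the "a lot more devices" and the segmentation fault described in the question.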

                • proper device number query

                  Thanks for the info. The platform query did work. However the following command did not:

                  clGetContextInfo(context, CL_CONTEXT_NUM_DEVICES, sizeof(cl_uint), &numDevices, 0);

                  The compiler reports that CL_CONTEXT_NUM_DEVICES was not declared in this scope. Are you absolutely sure it works this way?

                    • proper device number query

                      This is the generic method I use to identify all devices accessible in a system. It scans all the available platforms, puts all the devices in a dynamically allocated array, and then prints the results.

                      If you have both NVIDIA and AMD platforms and devices correctly installed, it will recognize both.

                      cl_uint i;
                      cl_uint num_platforms = 0, num_devices = 0, temp_uint, temp_uint2;

                      /* Count the platforms, then fetch their IDs. */
                      if (clGetPlatformIDs(0, NULL, &num_platforms) != CL_SUCCESS)
                          printf("Failed to query platform count!\n");
                      printf("Number of Platforms: %u\n", num_platforms);

                      cl_platform_id * platforms = (cl_platform_id *) malloc(sizeof(cl_platform_id) * num_platforms);
                      if (clGetPlatformIDs(num_platforms, &platforms[0], NULL) != CL_SUCCESS)
                          printf("Failed to get platform IDs\n");

                      /* First pass: add up the device counts of all platforms. */
                      for (i = 0; i < num_platforms; i++)
                      {
                          temp_uint = 0;
                          if (clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, 0, NULL, &temp_uint) != CL_SUCCESS)
                              printf("Failed to query device count on platform %u!\n", i);
                          num_devices += temp_uint;
                      }
                      printf("Number of Devices: %u\n", num_devices);

                      cl_device_id * devices = (cl_device_id *) malloc(sizeof(cl_device_id) * num_devices);

                      /* Second pass: append each platform's devices to the array.
                         temp_uint is the number of entries filled so far. */
                      temp_uint = 0;
                      for (i = 0; i < num_platforms; i++)
                      {
                          temp_uint2 = 0;
                          /* Pass only the remaining capacity, not the full array size. */
                          if (clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, num_devices - temp_uint, &devices[temp_uint], &temp_uint2) != CL_SUCCESS)
                              printf("Failed to query device IDs on platform %u!\n", i);
                          temp_uint += temp_uint2;
                      }

                      //Insert your program here

                      char buff[128];
                      for (i = 0; i < num_devices; i++)
                      {
                          clGetDeviceInfo(devices[i], CL_DEVICE_NAME, sizeof(buff), buff, NULL);
                          printf("Device %u: %s\n", i, buff);
                      }

                      free(devices);
                      free(platforms);