

Adept II

Multi-device issue

Excessive memory still allocated on first device

I got a report from an owner of two HD 5970 cards:
After more than 10 update cycles on device0, device1 is in fact updating, so there is progress, but one task is using large amount of RAM (see below).

Furthermore the update on both is erratic, as it has stopped at 18.918 and 25.225% with about 0,06% CPU utilization for both.

Task 1 (using 625 160K RAM)
Task 2 (using 148 288K RAM)
Because of an ATI bug with multi-core GPUs that is still present in SDK 2.2, I added an option to disable the secondary core on each of the 5970 cards.
That is, 2 tasks are executed, each on its own GPU.

Such a big difference in host memory clearly suggests that roughly the same excessive amount of GPU memory was allocated on the first GPU. The workload for this app requires roughly the same amount of memory for all workunits, so such a big difference is just impossible if each instance allocates memory only on its own device.

I use this piece of code to bind the context to one particular device only:

What is wrong here, and why is memory still always allocated on the first GPU?

#if USE_OPENCL
    cl_uint num_devices=0;
    cl_device_id devices[10];
    cl_context_properties cps[3] = { CL_CONTEXT_PLATFORM, (cl_context_properties)platform, 0 };
    cl_context_properties* cprops = (NULL == platform) ? NULL : cps;
    cl_uint num_entries=10;//R:hardly possible that more than 10 GPUs will be in single host
#if USE_OPENCL_CPU
    err=clGetDeviceIDs(platform,CL_DEVICE_TYPE_CPU,num_entries,devices,&num_devices);
#else
    err=clGetDeviceIDs(platform,CL_DEVICE_TYPE_GPU,num_entries,devices,&num_devices);
#endif
    if(err!=CL_SUCCESS)fprintf(stderr,"ERROR: clGetDeviceIDs (second call): %d\n",err);
    device_id=devices[assigned_device];
    context = clCreateContext (cprops,1,&device_id,NULL,NULL,&err);
    if(err!=CL_SUCCESS)fprintf(stderr,"ERROR: clCreateContext: %d\n",err);
    cq = clCreateCommandQueue( context, device_id, CL_QUEUE_PROFILING_ENABLE, &err);
    if(err != CL_SUCCESS){fprintf(stderr,"Creating Command Queue. (clCreateCommandQueue) %d\n",err);fflush(stderr);exit(-1);}
#endif
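One thing the code above does not check is whether assigned_device actually falls inside the list that clGetDeviceIDs returned. This is probably not the cause of the memory issue, but it is cheap to rule out. Below is a minimal sketch of such a check in plain C, so it compiles without the OpenCL headers. The helper name checked_device_index and its capacity parameter are hypothetical and not part of the original code. It also assumes, as I understand the OpenCL spec, that num_devices reports how many matching devices exist and can therefore be larger than num_entries.

```c
#include <stdio.h>

/* Hypothetical helper (not in the original code): returns the index to use
 * into devices[], or -1 if the requested index is out of range.
 *
 * assigned_device: index the task was told to use
 * num_devices:     count reported back by clGetDeviceIDs
 * capacity:        size of the devices[] array (num_entries in the post)
 */
static int checked_device_index(unsigned int assigned_device,
                                unsigned int num_devices,
                                unsigned int capacity)
{
    /* Only the first min(num_devices, capacity) slots of devices[] hold
     * valid IDs; anything beyond that is uninitialized. */
    unsigned int usable = (num_devices < capacity) ? num_devices : capacity;

    if (assigned_device >= usable) {
        fprintf(stderr,
                "ERROR: assigned_device %u out of range (%u usable devices)\n",
                assigned_device, usable);
        return -1;
    }
    return (int)assigned_device;
}
```

In the code above this would be called right before device_id=devices[assigned_device];, for example as checked_device_index(assigned_device, num_devices, num_entries), bailing out if it returns -1.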

2 Replies

send a testcase to

Journeyman III


I ran into the same problem. In the SDK 2.2 simpleMultiDevice example, memory seems to be allocated only on the first GPU.