
fmilano
Journeyman III

Using less than all the cores in OpenCL

Question about using less than all the available cores.

Hi,

I'm trying out the beta4 release of OpenCL and I was wondering if it is possible to use less than all the available cores in a CPU device. In my particular case I'm using a quad core, but I want my application to use just two cores. Is this possible? I haven't found a configuration option that allows me to do this in the OpenCL standard.

I would really appreciate any example or suggestion.

Thanks in advance,

Federico

nou
Exemplar

Try setting the environment variable CPU_MAX_COMPUTE_UNITS to the number of cores you want to use.


Thanks a lot nou! That works ok. Anyway, is there any way to do this programmatically with the OpenCL API, without changing environment variables?

Thanks again!

Federico


Originally posted by: fmilano Thanks a lot nou! That works ok. Anyway, is there any way to do this programmatically with the OpenCL API, without changing environment variables?

Currently there is no way to set the number of cores from OpenCL. This feature is expected in a future release.
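For anyone who wants to avoid setting the variable in the shell, one possible workaround is to set CPU_MAX_COMPUTE_UNITS from inside the application before the first OpenCL call. This is only a sketch: it assumes the runtime reads the variable when it first initializes, which nobody in this thread has confirmed, and the helper name `limit_cpu_compute_units` is made up for illustration. It uses POSIX `setenv`; on Windows `_putenv_s` would be the equivalent call.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: sets CPU_MAX_COMPUTE_UNITS for the current process.
 * Returns 0 on success, -1 on failure or if cores is 0.
 * Assumption: the OpenCL runtime reads this variable when it is first
 * initialized, so this must run before any OpenCL function is called. */
int limit_cpu_compute_units(unsigned int cores)
{
    char value[16];

    if (cores == 0) {
        return -1;
    }
    /* Format the core count as a decimal string, e.g. "2". */
    snprintf(value, sizeof value, "%u", cores);
    /* Third argument 1 = overwrite any existing value. */
    return setenv("CPU_MAX_COMPUTE_UNITS", value, 1);
}
```

The intended usage would be to call `limit_cpu_compute_units(2)` at the very start of `main`, before `clGetPlatformIDs` or any other OpenCL function.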


Currently there is no way to set number of cores from OpenCL. It is expected to be released in future.

 

Are you sure about this? I was under the assumption that setting global_work_size in clEnqueueNDRangeKernel would determine the number of cores used. If local_work_size is 1 and global_work_size is 4, this would correspond to utilizing all 4 cores. Can anyone confirm whether this is correct?

 

Maybe I am mistaken though, because I have been noticing some work I have been doing hasn't been scaling well as I increase global_work_size.


Originally posted by: RyFo18 Are you sure about this?

I am sure that there is no way to set the number of cores from OpenCL.

Originally posted by: RyFo18 I was under the assumption that by setting the global_work_size in clEnqueueNDRangeKernel, then that would correspond to the number of cores used. If local_work_size is 1, and global_work_size is 4, this would correspond to utilizing all 4 cores. Can anyone confirm if this is correct or not? Maybe I am mistaken though, because I have been noticing some work I have been doing hasn't been scaling well as I increase global_work_size.

This is not true.


See this slide : http://img3.imageshack.us/img3/1153/openclarchitecture.jpg

According to it, a work-group corresponds to a hardware thread on the CPU. Hence the 4 work-groups in your example should map to 4 cores.


n0thing,
The mapping between a hardware thread and a work-group is not a 1-1 mapping. A single hardware thread can run all the work-groups, or the work-groups can be split up among the hardware threads. The slide you are referencing is not a completely accurate portrayal of our implementation but an abstracted view of it.
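
To make the point above concrete, here is a small sketch of the arithmetic. The work-group count follows from global_work_size and local_work_size (in OpenCL 1.x the global size must be a multiple of the local size), but how those groups land on hardware threads is up to the runtime. The round-robin split in `groups_on_thread` is a purely hypothetical model chosen for illustration, not the scheduler the implementation actually uses, and both function names are invented here.

```c
#include <stddef.h>

/* Number of work-groups in one dimension of an NDRange.
 * Returns 0 if the sizes are invalid: OpenCL 1.x requires
 * global_work_size to be a multiple of local_work_size. */
size_t num_work_groups(size_t global_work_size, size_t local_work_size)
{
    if (local_work_size == 0 || global_work_size % local_work_size != 0) {
        return 0;
    }
    return global_work_size / local_work_size;
}

/* Hypothetical model only: distribute work-groups round-robin over
 * num_threads hardware threads and return how many groups the thread
 * with index thread_id executes. A real runtime may schedule
 * differently. */
size_t groups_on_thread(size_t num_groups, size_t num_threads,
                        size_t thread_id)
{
    if (num_threads == 0 || thread_id >= num_threads) {
        return 0;
    }
    /* Every thread gets the even share; the first (num_groups %
     * num_threads) threads each take one leftover group. */
    return num_groups / num_threads
         + (thread_id < num_groups % num_threads ? 1 : 0);
}
```

With global_work_size 4 and local_work_size 1 there are 4 work-groups. In this model, 4 threads would take one group each, 2 threads would take two each, and a single thread would run all 4, so the work-group count by itself does not fix the number of cores in use.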

Thanks for the info.
