
d_a_a_
Adept II

How do I run independent serial OpenCL instances on different CPU cores?

Hi,

I am running serial single-core CPU benchmarks (as a point of reference) by setting the environment variable CPU_MAX_COMPUTE_UNITS to 1. Unfortunately, each run takes too long (up to 140 times slower than the GPU version!), so I would like to run multiple instances of the serial benchmark simultaneously on my six-core processor to finish the set of benchmarks more quickly.

However, when I try this, every independent instance is assigned to the same CPU core (the first one, CPU0) instead of each instance going to a different core. How can I run multiple instances of the serial benchmark on separate CPU cores?

I'm using Debian GNU/Linux 64-bit and the AMD SDK v2.2.


Thank you.

 

4 Replies
nou
Exemplar

Try setting the CPU affinity:

man taskset

Otherwise, maybe device fission.


Originally posted by: nou
try set CPU affinity.
man taskset
otherwise maybe device fission.

Thank you for your reply.

I'm having mixed success with taskset. Whenever I start my OpenCL process with taskset, it initially runs on the CPU core I specified, but then the process automatically migrates to CPU0. Checking the affinity of the running process returns the correct value, as specified initially, so the problem is probably somewhere else. In my host code, before initializing anything OpenCL-related, I have the following line: 'setenv( "CPU_MAX_COMPUTE_UNITS", "1", 1 );'

Setting the affinity after the application has started occasionally works, but I cannot run my set of benchmarks just hoping that taskset will work.
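For readers who want to set and verify the affinity from inside the program rather than through taskset, here is a minimal sketch using the Linux sched_setaffinity/sched_getaffinity calls. The helper names pin_to_core and allowed_cpus are hypothetical, not part of any SDK. Note that this only sets the mask of the calling thread (threads created afterwards inherit it); it is an assumption, not something confirmed in this thread, that the OpenCL runtime leaves that mask alone.

```cpp
#include <sched.h>   // sched_setaffinity, sched_getaffinity, CPU_* macros (Linux/glibc)
#include <vector>

// Hypothetical helper: list the CPU indices the calling thread may run on.
// Returns an empty vector if the affinity mask cannot be read.
std::vector<int> allowed_cpus() {
    std::vector<int> cpus;
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return cpus;
    }
    for (int i = 0; i < CPU_SETSIZE; ++i) {
        if (CPU_ISSET(i, &mask)) {
            cpus.push_back(i);
        }
    }
    return cpus;
}

// Hypothetical helper: restrict the calling thread to a single core.
// Returns true on success, false if the core index is invalid or the
// kernel rejects the request.
bool pin_to_core(int core) {
    if (core < 0 || core >= CPU_SETSIZE) {
        return false;
    }
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(core, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

Calling pin_to_core before any OpenCL initialization, and allowed_cpus again once the benchmark is running, would at least show whether the mask was changed behind the program's back.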


IMHO the OpenCL runtime sets thread affinity itself, so it may collide with taskset.


Thank you nou,

I have decided to solve the issue via device fission; it works like a charm. For those interested in the details, please see the code below.

#define USE_CL_DEVICE_FISSION
#include <CL/cl.hpp>

[...]

// Avoid setting a number of cores greater than the amount available
size_t cu = std::min( (size_t) desired_number_of_cpu_cores,
                      (size_t) device.getInfo<CL_DEVICE_MAX_COMPUTE_UNITS>() );

// Set the way we want to subdivide the device by creating a list of properties
cl_device_partition_property_ext subdevice_properties[] = {
    CL_DEVICE_PARTITION_BY_COUNTS_EXT, cu,
    CL_PARTITION_BY_COUNTS_LIST_END_EXT,
    CL_PROPERTIES_LIST_END_EXT
};

// Create a sub device containing the specified number of processors the user wants
std::vector<cl::Device> subdevices;
device.createSubDevices( subdevice_properties, &subdevices );

// Finally, set 'device' to be the newly created device (subdevices)
device = subdevices.front();
