
jprice
Journeyman III

Question about using APUs

Using both GPU and CPU components

I'm using a system that gives the following platform information:

Name      : AMD Accelerated Parallel Processing
Vendor    : Advanced Micro Devices, Inc.
Version   : OpenCL 1.1 AMD-APP-SDK-v2.5 (684.213)
Device #1 : BeaverCreek (Advanced Micro Devices, Inc.) [CL_DEVICE_TYPE_GPU]
Device #2 : AMD A8-3850 APU with Radeon(tm) HD Graphics (AuthenticAMD) [CL_DEVICE_TYPE_CPU]

I'm able to use either of these two devices individually without a problem. However, if I try to use both together, it fails: the kernel doesn't execute at all on the GPU, and no error is reported.

Is there something I need to do to be able to fully utilise the APU, or am I misunderstanding something about how these systems work?

Thanks.

0 Likes
6 Replies
genaganna
Journeyman III

Could you please run the SimpleMultiDevice sample and paste the output here? It ships with the SDK samples.

Can you also explain your workflow for running on two devices?

0 Likes

$ ./SimpleMultiDevice
----------------------------------------------------------
CPU + GPU Test 1 : Single context Single Thread
----------------------------------------------------------
Total time : 72
Time of CPU : 68.8067
Time of GPU : 4.40075
----------------------------------------------------------
CPU + GPU Test 2 : Multiple context Single Thread
----------------------------------------------------------
Total time : 72
Time of CPU : 71.3133
Time of GPU : 4.40083
----------------------------------------------------------
CPU + GPU Test 3 : Multiple context Multiple Thread
----------------------------------------------------------
Total time : 73
Time of CPU : 72.2002
Time of GPU : 4.40078

 

I'm using a separate context for each device, enqueuing work into both queues from a single host thread.

Actually, it looks like the kernel is running on the GPU, but I'm getting back the same start and end times when querying with clGetEventProfilingInfo, which makes it appear that the kernel takes no time to run. I've seen this before with SDK v2.5 while running on another ATI card. Is this a bug you're aware of?

Thanks.
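
For anyone hitting the same symptom, one way to cross-check the event timings is to compare them against a host wall-clock measurement taken around the enqueue and clFinish. The sketch below covers only the arithmetic. It assumes the two timestamps have already been read with clGetEventProfilingInfo (CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, both reported in nanoseconds) on a queue created with CL_QUEUE_PROFILING_ENABLE. The function names and the ratio threshold are hypothetical and are not part of OpenCL or the AMD SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a pair of device timestamps in nanoseconds (the unit used by
 * clGetEventProfilingInfo) into elapsed milliseconds.
 * Returns 0.0 if the end timestamp precedes the start timestamp. */
double event_elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    if (end_ns < start_ns) {
        return 0.0;
    }
    return (double)(end_ns - start_ns) / 1.0e6;
}

/* Hypothetical heuristic: treat the event timing as suspect when it is
 * smaller than min_ratio times the host wall-clock time measured around
 * the same enqueue + clFinish. Wall-clock time includes launch overhead,
 * so min_ratio should be small (for example 0.01) to catch only
 * order-of-magnitude discrepancies. */
bool profiling_looks_suspect(double event_ms, double wall_ms, double min_ratio)
{
    if (wall_ms <= 0.0) {
        return false; /* no usable wall-clock reference */
    }
    return event_ms < wall_ms * min_ratio;
}
```

If the check fires while the results still verify correctly, the kernel is probably running and only the reported timestamps are off.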

0 Likes

Can you run the SimpleMultiDevice sample with the -e option? SimpleMultiDevice also uses clGetEventProfilingInfo. I am not aware of such issues. It looks like something is going wrong in the code. Could you please paste your code here?

0 Likes

I'd forgotten that this system has v2.3 installed alongside v2.5. The previous output for SimpleMultiDevice would have been using v2.3, and my code also works fine with v2.3.

 

Here's the output I get when I run against v2.5, using -e:

$ ./SimpleMultiDevice -e
----------------------------------------------------------
CPU + GPU Test 1 : Single context Single Thread
----------------------------------------------------------
Total time : 70
Time of CPU : 69.5631
Time of GPU : 0.000471
Verifying results for CPU : Passed!

Verifying results for GPU : Passed!

----------------------------------------------------------
CPU + GPU Test 2 : Multiple context Single Thread
----------------------------------------------------------
Total time : 70
Time of CPU : 68.5048
Time of GPU : 0.000217
Verifying results for CPU : Passed!

Verifying results for GPU : Passed!

----------------------------------------------------------
CPU + GPU Test 3 : Multiple context Multiple Thread
----------------------------------------------------------
Total time : 69
Time of CPU : 68.1591
Time of GPU : 0.000234
Verifying results for CPU : Passed!

Verifying results for GPU : Passed!



PASSED!

So it has the same problem as my code: the start and end times are always nearly identical. If I remember rightly, upgrading the driver fixed this issue with v2.5, but that's not an option here since I don't have root access.

I'll just use v2.3 of the SDK for now.
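
With two SDK versions installed side by side, it can help to confirm at run time which one a program is actually using. The sketch below parses a platform version string of the form shown in the first post ("OpenCL 1.1 AMD-APP-SDK-v2.5 (684.213)"), which would normally be obtained from clGetPlatformInfo with CL_PLATFORM_VERSION. The leading "OpenCL <major>.<minor>" part follows the OpenCL spec. The "SDK-v<major>.<minor>" pattern is an assumption based only on the string in this thread, and both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Parse the spec-defined prefix "OpenCL <major>.<minor>".
 * Returns 1 on success, 0 if the string does not match. */
int parse_opencl_version(const char *version, int *major, int *minor)
{
    return sscanf(version, "OpenCL %d.%d", major, minor) == 2;
}

/* Assumption: the platform-specific part contains "SDK-v<major>.<minor>",
 * as in the version string quoted in this thread.
 * Returns 1 on success, 0 if the tag is missing or malformed. */
int parse_amd_sdk_version(const char *version, int *major, int *minor)
{
    const char *tag = strstr(version, "SDK-v");
    if (tag == NULL) {
        return 0;
    }
    return sscanf(tag, "SDK-v%d.%d", major, minor) == 2;
}
```

Printing the parsed SDK version at start-up makes it obvious which runtime produced a given set of timings.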

0 Likes

It could be an issue with profiling. The sample is passing. It is recommended to use the latest driver with the latest SDK.

0 Likes

Yes, it looks like my kernels are actually running fine; it's just that the profiler is returning incorrect stats.

Last time I tried the latest driver, performance dropped by 10-15% because of the GPU_USE_SYNC_OBJECTS bug, but I'll try again on my own box soon to see if it's fixed or worth it.

Thanks.

0 Likes