dominik_g

Weird profiling information on Fusion GPU

Discussion created by dominik_g on Nov 11, 2011
Profiling information collected from Fusion GPU seems incorrect

Hi everyone,

I've been doing some performance measurements on a Fusion GPU using OpenCL events. The chip is an A8-3850 APU with Catalyst 11.10 and APP SDK 2.5-RC2 on OpenSUSE Linux.

I collect all four profiling timestamps (queued, submit, start, end) for a single OpenCL kernel with different input sizes. "queued-to-submit" takes about 60-80 microseconds in all cases, which seems normal. However, "submit-to-start" grows with the input size and is always roughly equal to "start-to-end" (between 0.3 and 4.2 seconds).
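
For readers who want to reproduce the three intervals, here is a minimal sketch of the arithmetic. It is not the code used for the measurements above. It assumes the four timestamps are already available in nanoseconds (the unit OpenCL reports for `CL_PROFILING_COMMAND_QUEUED`, `_SUBMIT`, `_START` and `_END`) from a queue created with profiling enabled. `ProfileTimes` and `profiling_intervals_us` are hypothetical names, and the sample values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for one event's profiling record.
# All four timestamps are device counter values in nanoseconds.
ProfileTimes = namedtuple("ProfileTimes", "queued submit start end")


def profiling_intervals_us(p):
    """Return the three intervals between the four timestamps, in microseconds."""
    return {
        "queued_to_submit": (p.submit - p.queued) / 1e3,
        "submit_to_start": (p.start - p.submit) / 1e3,
        "start_to_end": (p.end - p.start) / 1e3,
    }


# Made-up sample values, not measurements from either machine.
sample = ProfileTimes(queued=0, submit=70_000, start=570_000, end=200_570_000)
intervals = profiling_intervals_us(sample)

for name, value in intervals.items():
    print(f"{name}: {value:.1f} us")
```

The function only reads the attributes `queued`, `submit`, `start` and `end`, so any object exposing those four values in nanoseconds can be passed in place of the namedtuple.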

I ran exactly the same experiment on a machine with a Radeon HD5970 (same software setup). There, "submit-to-start" takes only around 500 microseconds for all inputs, while "start-to-end" grows from 0.2 to 2.1 seconds.

The results on the Radeon GPU make sense, but the profiling on the APU seems broken... Or does anyone have another explanation?

Dominik
