chesik
Staff

AMD APP Profiler 2.5 is now available

We are pleased to announce the availability of AMD APP Profiler v2.5.  For more information, please visit the product page at: http://developer.amd.com/TOOLS/AMDAPPPROFILER/Pages/default.aspx

New features and updates in this version include:

  • Support for AMD APP SDK v2.7.
  • Support for OpenCL™ 1.2.
  • Support for collecting performance counters on APU devices.
  • Full support for profiling with AMD Radeon™ HD7000 series GPUs based on Graphics Core Next/Southern Islands:
    • Added support for kernel occupancy analysis.
    • Added support for collecting performance counters for DirectCompute (DirectX 11) applications.
    • Addition of SALUBusy counter.
    • Fixed value reported for VALUBusy counter.
    • The values reported for LDSFetchInsts and LDSWriteInsts counters were inaccurate on AMD Radeon™ HD7000 series GPUs; thus for those GPUs, those two counters have been replaced by a single LDSInsts counter.
    • Fixed display of kernel ISA.
  • Improved OpenCL™ analysis module:
    • Added detection of deprecated OpenCL™ APIs.
  • Added support for showing source and destination location, as well as zero-copy status for memory transfers initiated using clEnqueueMapBuffer or clEnqueueMapImage. This information is displayed in the API Trace view.
  • Added support for Microsoft® Visual Studio® projects that use User-defined Macros in the project settings.
  • Fixed the --workingdirectory (-w) command line switch (set current directory) on Linux.
  • Fixed some problems with importing previously-generated profile results into Microsoft® Visual Studio®.
  • Changed the default installation directory on Windows to %PROGRAMFILES(X86)%\AMD\AMD APP Profiler to make it more consistent with other AMD tools (e.g., gDEBugger and CodeAnalyst).

Please post any feedback here.

0 Likes
17 Replies
rwelsch
Adept I

Hey,

Thanks for this version. The occupancy calculator seems to work under Linux now (at least it gives some numbers), but there are still two issues.

For reference, my setup:

Linux

Catalyst 12.4

sprofile - V2.5.1804

AMD APP SDK v2.7

HD 7970

1. The generation of the occupancy display page does not work (or I'm missing an option I have to use...):

I start it, as given in the help, with

sprofile -P "/path/to/occupancy/params/file.txt" -o "path/to/output.html"

and I get the error:

Error generating occupancy display file.

Unable to copy required files to output directory

Write permissions are set correctly, and I can see that some *.js files are created.

My file.txt looks like

.ThreadID=17469

CallIndex=67

KernelName=pot_kernel

DeviceName=Tahiti

ComputeUnits=32

MaxWavesPerComputeUnit=40

MaxWorkGroupPerComputeUnit=40

MaxVGPRs=256

MaxSGPRs=512

MaxLDS=32768

UsedVGPRs=25

UsedSGPRs=80

UsedLDS=1536

WavefrontSize=64

WorkGroupSize=64

WavesPerWorkGroup=1

MaxWorkGroupSize=256

MaxWavesPerWorkGroup=4

GlobalWorkSize=81920

MaxGlobalWorkSize=16777216

WavesLimitedByVGPR=40

WavesLimitedBySGPR=24

WavesLimitedByLDS=21

WavesLimitedByWorkgroup=40

Occupancy=52.5

I took the numbers from the *.occupancy file. The only thing I added is the "CallIndex" entry, which is not present there. (Does this number have any real meaning?)
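The numbers in the file above are at least internally consistent, which suggests the parameters file itself is not what triggers the error. A minimal sketch of that check follows. It assumes that occupancy is the tightest of the four WavesLimitedBy* values divided by MaxWavesPerComputeUnit; that formula is an assumption that happens to reproduce the reported 52.5, not a documented description of what the profiler does. The helper names `parse_params` and `occupancy_percent` are hypothetical.

```python
# Hypothetical helpers; not part of sprofile or APP Profiler.
# A subset of the Key=Value lines posted above.
PARAMS_TEXT = """\
MaxWavesPerComputeUnit=40
WavesLimitedByVGPR=40
WavesLimitedBySGPR=24
WavesLimitedByLDS=21
WavesLimitedByWorkgroup=40
Occupancy=52.5
"""


def parse_params(text):
    """Parse Key=Value lines into a dict, skipping blank or malformed lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def occupancy_percent(params):
    """Assumed model: tightest per-resource wave limit over the hardware maximum."""
    limit_keys = (
        "WavesLimitedByVGPR",
        "WavesLimitedBySGPR",
        "WavesLimitedByLDS",
        "WavesLimitedByWorkgroup",
    )
    tightest = min(int(params[key]) for key in limit_keys)
    return 100.0 * tightest / int(params["MaxWavesPerComputeUnit"])


params = parse_params(PARAMS_TEXT)
print(occupancy_percent(params), params["Occupancy"])
```

Under this assumption the LDS limit (21 of 40 waves) is the bottleneck for this kernel, which is why the 32 kB versus 64 kB question below matters.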

2. The profiler says I have 32 kB of local memory, and the device query also says 32 kB. But since I'm using an HD 7970, I was expecting 64 kB of local memory. Is it a driver problem or a display problem?

0 Likes

Ralph Welsch wrote:

2. The profiler says I have 32 kB of local memory, and the device query also says 32 kB. But since I'm using an HD 7970, I was expecting 64 kB of local memory. Is it a driver problem or a display problem?

I think I found the problem. As the new Programming Guide explains, there is a limit of 32 kB of LDS for each wave-front, and that is probably the number displayed. But the card actually has 64 kB of LDS per CU available, so a kernel consisting of several wave-fronts (at least two) can make full use of the 64 kB, and this is not taken into account in the new version of the profiler, right?

So I can multiply the "WavesLimitedByLDS" by two and get the correct result if I have multiple wave-fronts per CU.
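The arithmetic behind this guess can be checked with a small sketch. It assumes that the LDS wave limit is the number of whole work-groups whose LDS allocation fits into the CU's LDS, times the waves per work-group. That model is an assumption (it reproduces the reported WavesLimitedByLDS=21 for 32 kB), and `lds_wave_limit` is a hypothetical name, not a profiler function.

```python
# Hypothetical model of the LDS-based wave limit; values are from the post above.
def lds_wave_limit(lds_bytes_per_cu, used_lds_per_group, waves_per_group):
    """Waves per CU that fit into LDS, assuming whole work-groups share the CU's LDS."""
    groups_that_fit = lds_bytes_per_cu // used_lds_per_group
    return groups_that_fit * waves_per_group


USED_LDS = 1536          # UsedLDS
WAVES_PER_GROUP = 1      # WavesPerWorkGroup
MAX_WAVES_PER_CU = 40    # MaxWavesPerComputeUnit

limit_32k = lds_wave_limit(32 * 1024, USED_LDS, WAVES_PER_GROUP)
limit_64k = lds_wave_limit(64 * 1024, USED_LDS, WAVES_PER_GROUP)
# The hardware maximum still caps whatever the LDS budget would allow.
capped_64k = min(limit_64k, MAX_WAVES_PER_CU)

print(limit_32k, limit_64k, capped_64k)
```

With these inputs the 64 kB figure is exactly twice the 32 kB one, but because of the integer division that is not guaranteed for every UsedLDS value. Even if the doubling applies, the result would be capped at 40 waves, and for this kernel WavesLimitedBySGPR=24 would then become the tightest limit.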

0 Likes

Hi all,

I have the same problem with this command:

sprofile -P "/path/to/occupancy/params/file.txt" -o "path/to/output.html"

I get the same error message as rwelsch. Any ideas?

My environment:

Ubuntu 10.04

GPU: AMD HD 7750

GPU driver: AMD Catalyst 12.6 Proprietary Linux x86 Display Driver (amd-driver-installer-12-6-x86.x86_64.run)

CPU: Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz

AMD APP SDK v2.7

0 Likes

Thanks for reporting this. It is a bug in APP Profiler, and the next version will include the fix. For now, only the Windows version of APP Profiler can generate occupancy charts.

Thanks!

By the way, can Visual Studio generate occupancy charts from the ".atp" or ".csv" files?

Or do I have to manually add something to the ".atp" or ".csv" file in order to do that?

0 Likes

Hi, I am new to AMD APP Profiler and I have run into a problem.

How can I create the file.txt used in this command line?

sprofile -P "/path/to/occupancy/params/file.txt" -o "path/to/output.html"

Thanks!
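Earlier in this thread, rwelsch describes building this file by hand from the values in the *.occupancy file, plus a CallIndex entry. A minimal sketch of writing such a Key=Value file follows. The key names and values are copied from his post (the leading dot on his first line is assumed to be a stray character), the helper `write_params_file` and the output file name are hypothetical, and whether sprofile accepts exactly this set of keys is not confirmed here.

```python
import os
import tempfile

# Values copied from the example posted earlier in this thread (HD 7970, "Tahiti").
OCCUPANCY_PARAMS = {
    "ThreadID": 17469,
    "CallIndex": 67,
    "KernelName": "pot_kernel",
    "DeviceName": "Tahiti",
    "ComputeUnits": 32,
    "MaxWavesPerComputeUnit": 40,
    "MaxWorkGroupPerComputeUnit": 40,
    "MaxVGPRs": 256,
    "MaxSGPRs": 512,
    "MaxLDS": 32768,
    "UsedVGPRs": 25,
    "UsedSGPRs": 80,
    "UsedLDS": 1536,
    "WavefrontSize": 64,
    "WorkGroupSize": 64,
    "WavesPerWorkGroup": 1,
    "MaxWorkGroupSize": 256,
    "MaxWavesPerWorkGroup": 4,
    "GlobalWorkSize": 81920,
    "MaxGlobalWorkSize": 16777216,
    "WavesLimitedByVGPR": 40,
    "WavesLimitedBySGPR": 24,
    "WavesLimitedByLDS": 21,
    "WavesLimitedByWorkgroup": 40,
    "Occupancy": 52.5,
}


def write_params_file(path, params):
    """Write one Key=Value pair per line, in the order given."""
    with open(path, "w") as handle:
        for key, value in params.items():
            handle.write(f"{key}={value}\n")


# Hypothetical output location; pass this path to sprofile -P.
out_path = os.path.join(tempfile.gettempdir(), "occupancy_params.txt")
write_params_file(out_path, OCCUPANCY_PARAMS)
print(out_path)
```

For a different kernel, the values would be replaced with those from the corresponding entry in that session's *.occupancy file.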

0 Likes

APP Profiler has been superseded by CodeXL. Please download CodeXL, which includes the same command-line tool, sprofile, with bug fixes that should address your problem.

0 Likes
Raistmer
Adept II

A driver limitation with SDK 2.7 prevents me from switching to it, but I took APP Profiler v2.5 from it.

I would like to share some observations on the APP Profiler 2.4 -> 2.5 change.

1. The bug with tracing ICC-compiled binaries is fixed for the VS plugin, thanks a lot!

2. The bug with loading an old binary at trace start is fixed too.

3. 2.5 can't fully load a trace session created with 2.4: it can't show occupancy data for kernels. I checked: newly generated sessions show that data fine, but old sessions created with v2.4 do not.

4. Maybe the reason is a change in the device parameters: v2.4 said the Loveland C-60 has a wavefront size of 32, while v2.5 says it has a wavefront size of 64.

Who is right and who is wrong?

The maximum number of wavefronts per CU changed accordingly: v2.4 said the maximum is 64 wavefronts per CU, while v2.5 says it is 32.

Again, who is right here?

At least the number of workgroups, the maximum workgroup size and the CU number remained the same ;).

Driver used: 11.12 mobile version (it's a C-60 based netbook).

0 Likes

Hi Raistmer

Thanks for using APP Profiler.

3. We've added a few new fields to the occupancy output file, which caused the incompatibility. Sorry for the inconvenience.

4. Loveland has a wave size of 64. Max waves per CU is 49.

0 Likes

Thanks for info.

1. Bug with ICC-compiled binaries trace is fixed for VS plugin, thanks a lot!

Unfortunately, I was too optimistic about that. The bug is actually still there, but it appears I unintentionally found a workaround.

The bug: if an ICC project is the startup project, the APP Profiler plugin for VS2008 can't start an API trace session. This is still the case in this APP Profiler version.

The workaround: luckily, I have a solution with a few projects in it, and one of them is an MS VC-based project rather than an ICC one. When the MS VC project is the startup project, the trace session dialog opens fine, and one can then choose the ICC-compiled binary of a different project instead of the startup project's binary. This bug is clearly in the plugin and has nothing to do with the profiler itself, but it adds some inconvenience (and without such a workaround it makes it impossible to run a trace session for an ICC project in VS). I hope it will be eliminated in the next release.

0 Likes
sherifbadr
Journeyman III

I have an AMD Sapphire HD 7950 OC. When I open AMD APP SDK v2.7, it never recognizes my kernel.

Do I need to install AMD APP Profiler v2.5 to compile my kernel, or is there another solution?

I looked in its options and found that the latest card listed is the 6970; I could not find the 7000 series or my kernel name, "Tahiti".

I have AMD Catalyst suite v12.4.

Please help me with the right steps.

I want to render on my GPU.

0 Likes

Sorry, but what exactly does not recognize your kernel?

The profiler is used to profile your OpenCL program on AMD platforms; it is an optional package, and it is included in the APP SDK.

Try again with a newer GPU driver, e.g. 12.6 or 12.7.

0 Likes
chevydevil
Adept II

Hello, collecting GPU performance counters with the profiler doesn't work for me with Catalyst 12.10, APP SDK 2.7, Ubuntu 12.04 x64 and a Radeon 7870 or Radeon 5870. The command-line profiler just quits without any message when it reaches the kernel execution, and when I use CodeXL the only error message is: "Failed to generate profile result".

I tried different kernels. CL_QUEUE_PROFILING_ENABLE was also set when creating the command queues.

Any suggestions?

Update: I tried some of the samples. The SimpleGL example could be profiled; Nbody and Fluid2D could not.

0 Likes

Hi Chevydevil

Could you please try our latest developer tool CodeXL? We've fixed a few issues in the GPU profiling back end related to Catalyst compatibility. CodeXL also provides a nice GUI for the GPU profiling on Linux.

0 Likes

As I wrote above, I tried CodeXL already. It is not working and gives the error message above.

0 Likes

With the 12.11 beta driver, I get some results when using CodeXL. The performance counter collection stops after two or three iterations, though. But better than nothing!

0 Likes