17 Replies Latest reply on Apr 10, 2013 3:00 PM by lbin

    AMD APP Profiler 2.5 is now available

    chesik

      We are pleased to announce the availability of AMD APP Profiler v2.5.  For more information, please visit the product page at: http://developer.amd.com/TOOLS/AMDAPPPROFILER/Pages/default.aspx

       

      New features and updates in this version include:

       

      • Support for AMD APP SDK v2.7.
      • Support for OpenCL™ 1.2.
      • Support for collecting performance counters on APU devices.
      • Full support for profiling with AMD Radeon™ HD7000 series GPUs based on Graphics Core Next/Southern Islands:
        • Added support for kernel occupancy analysis.
        • Added support for collecting performance counters for DirectCompute (DirectX 11) applications.
        • Addition of SALUBusy counter.
        • Fixed value reported for VALUBusy counter.
        • The values reported for LDSFetchInsts and LDSWriteInsts counters were inaccurate on AMD Radeon™ HD7000 series GPUs; thus for those GPUs, those two counters have been replaced by a single LDSInsts counter.
        • Fixed display of kernel ISA.
      • Improved OpenCL™ analysis module:
        • Added detection of deprecated OpenCL™ APIs.
      • Added support for showing source and destination location, as well as zero-copy status for memory transfers initiated using clEnqueueMapBuffer or clEnqueueMapImage. This information is displayed in the API Trace view.
      • Added support for Microsoft® Visual Studio® projects that use User-defined Macros in the project settings.
      • Fixed the --workingdirectory (-w) command line switch (set current directory) on Linux.
      • Fixed some problems with importing previously-generated profile results into Microsoft® Visual Studio®.
      • Changed the default installation directory on Windows to %PROGRAMFILES(X86)%\AMD\AMD APP Profiler to make it more consistent with other AMD tools (e.g., gDEBugger and CodeAnalyst).

       

      Please post any feedback here.

        • Re: AMD APP Profiler 2.5 is now available
          rwelsch

          Hey,

           

          Thanks for this version. The occupancy calculator seems to work under Linux now (at least it gives some numbers), but there are still two issues, listed below.

           

          By the way, I'm using Linux with:

          Catalyst 12.4

          sprofile - V2.5.1804

          AMDAPP SDK v2.7

          HD 7970

           

          1. The generation of the occupancy display page does not work (or I'm missing an option I have to use...):

          I start it, as given in the help, with

          sprofile -P "/path/to/occupancy/params/file.txt" -o "path/to/output.html"

          and I get the error:

           

          Error generating occupancy display file.

          Unable to copy required files to output directory

           

          The permissions for writing etc. are set correctly. I can see that there are some *.js files created.

           

          My file.txt looks like

          .ThreadID=17469

          CallIndex=67

          KernelName=pot_kernel

          DeviceName=Tahiti

          ComputeUnits=32

          MaxWavesPerComputeUnit=40

          MaxWorkGroupPerComputeUnit=40

          MaxVGPRs=256

          MaxSGPRs=512

          MaxLDS=32768

          UsedVGPRs=25

          UsedSGPRs=80

          UsedLDS=1536

          WavefrontSize=64

          WorkGroupSize=64

          WavesPerWorkGroup=1

          MaxWorkGroupSize=256

          MaxWavesPerWorkGroup=4

          GlobalWorkSize=81920

          MaxGlobalWorkSize=16777216

          WavesLimitedByVGPR=40

          WavesLimitedBySGPR=24

          WavesLimitedByLDS=21

          WavesLimitedByWorkgroup=40

          Occupancy=52.5

          I got the numbers from the *.occupancy file. The only thing I added is the "CallIndex" field, which is not present there. (Is there any real meaning behind this number?)

           

          2. The profiler says I have 32 kB of local memory. The device query also says I have 32 kB of local memory. But as I'm using an HD 7970, I'm expecting 64 kB of local memory. Is it a driver problem or a display problem?
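The limits in the parameter file under item 1 are consistent with a simple rule: take the smallest WavesLimitedBy* value and divide it by MaxWavesPerComputeUnit. A minimal Python sketch of that check follows. The rule is an assumption inferred from the posted numbers, not a documented description of how the profiler computes occupancy, and the helper names are made up for illustration.

```python
# Values copied from the parameter file posted above (only the fields needed here).
PARAMS_TEXT = """\
MaxWavesPerComputeUnit=40
WavesLimitedByVGPR=40
WavesLimitedBySGPR=24
WavesLimitedByLDS=21
WavesLimitedByWorkgroup=40
Occupancy=52.5
"""


def parse_params(text):
    """Parse KEY=VALUE lines into a dict of floats; non-numeric values are skipped."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue
        try:
            params[key] = float(value)
        except ValueError:
            pass
    return params


def occupancy_percent(params):
    """Assumed rule: tightest WavesLimitedBy* value over the per-CU maximum."""
    limits = {k: v for k, v in params.items() if k.startswith("WavesLimitedBy")}
    limiting_key = min(limits, key=limits.get)
    percent = 100.0 * limits[limiting_key] / params["MaxWavesPerComputeUnit"]
    return limiting_key, percent


params = parse_params(PARAMS_TEXT)
limiter, percent = occupancy_percent(params)
print(limiter, percent)
```

With these numbers the LDS limit (21 waves) is the tightest one, and 21 / 40 matches the Occupancy=52.5 line in the file.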

            • Re: AMD APP Profiler 2.5 is now available
              rwelsch

              Ralph Welsch wrote:

               

              2. The profiler says I have 32 kB of local memory. The device query also says I have 32 kB of local memory. But as I'm using an HD 7970, I'm expecting 64 kB of local memory. Is it a driver problem or a display problem?

               

              I think I found the problem. As the new Programming Guide explains, there is a limit of 32 kB of LDS for each wave-front, and that is probably the number displayed. But the card actually has 64 kB of LDS per CU available, so a kernel consisting of several wave-fronts (at least two) can make full use of the 64 kB. This is not taken into account in the new version of the profiler, right?

              So if I have multiple wave-fronts per CU, I can multiply "WavesLimitedByLDS" by two to get the correct result.
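To make the doubling argument concrete, here is a small Python sketch. It assumes the LDS limit is computed as the number of whole work-groups that fit in the available LDS, times WavesPerWorkGroup. That formula is a guess that happens to reproduce the reported WavesLimitedByLDS=21, and the function name is hypothetical.

```python
def waves_limited_by_lds(lds_bytes_available, lds_bytes_per_group, waves_per_group):
    """Assumed rule: whole work-groups that fit in LDS, times waves per work-group."""
    groups_that_fit = lds_bytes_available // lds_bytes_per_group
    return groups_that_fit * waves_per_group


USED_LDS = 1536        # UsedLDS from the posted parameter file
WAVES_PER_GROUP = 1    # WavesPerWorkGroup from the posted parameter file
MAX_WAVES_PER_CU = 40  # MaxWavesPerComputeUnit from the posted parameter file

with_32k = waves_limited_by_lds(32 * 1024, USED_LDS, WAVES_PER_GROUP)
with_64k = waves_limited_by_lds(64 * 1024, USED_LDS, WAVES_PER_GROUP)

# The LDS limit cannot push the wave count above the per-CU maximum.
effective_64k = min(with_64k, MAX_WAVES_PER_CU)

print(with_32k, with_64k, effective_64k)  # prints "21 42 40"
```

Under that assumption, 64 kB would allow 42 waves, which is above the per-CU maximum of 40, so LDS would no longer be the limiting resource for this kernel and the SGPR limit of 24 would take over. Because of the rounding down, doubling the 32 kB figure is only approximately right in general.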

              • Re: AMD APP Profiler 2.5 is now available
                cycheng-isgd

                Hi all,

                 

                I have the same problem with this command:

                sprofile -P "/path/to/occupancy/params/file.txt" -o "path/to/output.html"

                 

                I get the same error message as rwelsch. Any ideas?

                 

                My environment:

                Ubuntu 10.04

                GPU : AMD HD 7750

                GPU Driver :  AMD Catalyst 12.6 Proprietary Linux x86 Display Driver  (amd-driver-installer-12-6-x86.x86_64.run)

                CPU : Intel(R) Core(TM) i5 CPU 760  @ 2.80GHz

                AMDAPP SDK v2.7

                • Re: AMD APP Profiler 2.5 is now available
                  lwh1990

                  Hi, I am new to AMD APP Profiler and have run into a problem.

                   

                  How do I create the file.txt used in the following command line?

                  sprofile -P "/path/to/occupancy/params/file.txt" -o "path/to/output.html"

                   

                   

                   

                  Thanks!
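Going by rwelsch's post earlier in the thread, the parameter file is a plain list of KEY=VALUE lines whose values come from the *.occupancy file, plus a CallIndex field. The Python sketch below writes such a file. It assumes that this layout is what the -P option expects, which the thread does not confirm; the function name and the temporary output path are hypothetical, and ThreadID is written without the leading dot shown in that post on the assumption that the dot is a typo.

```python
import os
import tempfile


def write_occupancy_params(path, fields):
    """Write one KEY=VALUE line per field, in the order given."""
    with open(path, "w", encoding="ascii") as handle:
        for key, value in fields.items():
            handle.write(f"{key}={value}\n")


# Field names and values as listed in rwelsch's post.
fields = {
    "ThreadID": 17469,
    "CallIndex": 67,
    "KernelName": "pot_kernel",
    "DeviceName": "Tahiti",
    "ComputeUnits": 32,
    "MaxWavesPerComputeUnit": 40,
    "MaxWorkGroupPerComputeUnit": 40,
    "MaxVGPRs": 256,
    "MaxSGPRs": 512,
    "MaxLDS": 32768,
    "UsedVGPRs": 25,
    "UsedSGPRs": 80,
    "UsedLDS": 1536,
    "WavefrontSize": 64,
    "WorkGroupSize": 64,
    "WavesPerWorkGroup": 1,
    "MaxWorkGroupSize": 256,
    "MaxWavesPerWorkGroup": 4,
    "GlobalWorkSize": 81920,
    "MaxGlobalWorkSize": 16777216,
    "WavesLimitedByVGPR": 40,
    "WavesLimitedBySGPR": 24,
    "WavesLimitedByLDS": 21,
    "WavesLimitedByWorkgroup": 40,
    "Occupancy": 52.5,
}

# Hypothetical location; any writable path would do.
params_path = os.path.join(tempfile.mkdtemp(), "file.txt")
write_occupancy_params(params_path, fields)
print(params_path)
```

The resulting path is what would be passed to -P in the command line quoted above. For your own kernel, the values would have to be replaced with the ones from your *.occupancy file.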

                • Re: AMD APP Profiler 2.5 is now available
                  Raistmer

                  A driver limitation prevents me from switching to SDK 2.7, but I took APP Profiler v2.5 from it.

                  I would like to share some observations on the APP Profiler 2.4 -> 2.5 change.

                   

                   

                  1. Bug with ICC-compiled binaries trace is fixed for VS plugin, thanks a lot!

                  2. The bug with loading an old binary at trace start is fixed too.

                  3. 2.5 can't fully load a trace3 session created with 2.4: it can't show occupancy data for kernels. I checked: newly generated sessions show that data OK, but old sessions created with v2.4 do not.

                  4. Maybe the reason is a change in device parameters: v2.4 said the Loveland C-60 has a wavefront of 32, v2.5 says it has a wavefront of 64.

                  Who is right and who is wrong?

                  The max number of wavefronts per CU changed accordingly: v2.4 said 64 wavefronts per CU is the max, v2.5 says 32 wavefronts per CU is the max.

                  Again, who is right in this?

                  At least the number of workgroups, the max workgroup size, and the CU count remained the same ;).

                   

                   

                  Driver used: 11.12 mobile version (it's a C-60-based netbook).

                    • Re: AMD APP Profiler 2.5 is now available
                      lbin

                      Hi Raistmer

                       

                      Thanks for using APP Profiler.

                      3. We've added a few new fields to the occupancy output file, which caused the incompatibility. Sorry for the inconvenience.

                      4. Loveland has a wave size of 64. Max waves per CU is 49.

                        • Re: AMD APP Profiler 2.5 is now available
                          Raistmer

                          Thanks for the info.

                           

                          1. Bug with ICC-compiled binaries trace is fixed for VS plugin, thanks a lot!

                           

                          Unfortunately, I was too optimistic about that. The bug is actually still there, but it appears I unintentionally found a workaround.

                          The bug: if an ICC project is the startup project, the APP Profiler plugin for VS2008 can't start an API trace session. This still happens in this APP Profiler version.

                          The workaround: luckily, I have a solution with several projects, and one of them is an MS VC project rather than an ICC one. When the MS VC project is the startup project, the trace session dialog opens OK, and one can then choose the ICC-compiled binary of a different project instead of the startup project's binary. This bug is clearly in the plugin and has nothing to do with the profiler itself, but it adds some inconvenience (and without this workaround it is impossible to run a trace session for an ICC project in VS). I hope it will be fixed in the next release.

                      • Re: AMD APP Profiler 2.5 is now available
                        sherifbadr

                        I have an AMD Sapphire HD 7950 OC. When I open AMD APP SDK v2.7, it never recognizes my kernel.

                        Should I install AMD APP Profiler v2.5 to compile my kernel, or is there another solution?

                        I looked in its options and found that the latest card listed is the 6970. I could not find the 7000 series or my kernel name, "Tahiti".

                        I have AMD Catalyst suite v12.4.

                        Please help me with the right steps.

                        I want to render on my GPU.

                        • Re: AMD APP Profiler 2.5 is now available
                          chevydevil

                          Hello, using the profiler for GPU performance counters doesn't work for me with Catalyst 12.10, APP SDK 2.7, Ubuntu 12.04 x64, and a Radeon 7870 or Radeon 5870. The command line profiler just quits without any message when reaching the kernel execution, and when I use CodeXL the only error message is: "Failed to generate profile result".

                          I tried different kernels. CL_QUEUE_PROFILING_ENABLE was also set when creating the command queues.

                          Any suggestions?

                           

                          Update: I tried some of the samples. The SimpleGL example could be profiled; Nbody and Fluid2D could not.

                          • Re: AMD APP Profiler 2.5 is now available
                            tomjackson

                            Could you please try our latest developer tool CodeXL? We've fixed a few issues in the GPU profiling back end related to Catalyst compatibility. CodeXL also provides a nice GUI for the GPU profiling on Linux. Now it is easy to use Linux for all purposes.

                             

                             

                             
