kenobrien
Journeyman III

Initialising AMDTPowerProfileAPI returns AMDT_ERROR_FAIL

Hi,

I have built an application which uses AMDTPowerProfileAPI. It compiles successfully, but when I initialise this library as below:

AMDTResult hResult = AMDTPwrProfileInitialize(AMDT_PWR_PROFILE_MODE_ONLINE);

I receive an error code in hResult of "AMDT_ERROR_FAIL".

I'm running CentOS 6.6. I did have to statically link against a newer libc to get this library to link, so perhaps something is corrupted at runtime.

It is not feasible to update the target machine. None of the other error codes are triggered. What could cause this error?

Best Regards,

Ken

rajeebbarman
Staff

Hi Ken,

Sorry for our delayed response. To investigate the problem further, could you please provide the following details?

  1. Which platform are you using? Do you have a discrete GPU connected as well?
  2. What is the output of the command "lsmod | grep pcore"?
  3. Did you try running the Power Profiler command-line tool? If not, please run the following command from the
    CodeXL bin directory:  ./CodeXLPowerProfiler -l

--Aalok


Hi,

Thank you for your response.

1. I am on CentOS 6.6 Linux. I have a discrete AMD GPU connected. The CPU is Intel-based. Is this an issue?

2. There is no output. Am I missing a driver?

3. The output of CodeXLPowerProfiler -l is : Failed to initialize the driver. (error code 0x80004005). I have fglrx loaded.

Can you pin down the specific compiler and runtime requirements for building against this library?

I have been able to link my executable in the traditional way so I have ruled that out as the problem. I will attempt to install the latest AMD graphics driver today.

Many thanks,

Ken


Hi Ken,

Thanks for your inputs.

Could you please try the following?

  1. It looks like the driver is not installed on your machine. Please run "sudo ./AMDTPwrProfDriverInstall.run" (from the CodeXL bin directory) to install the power profiler driver.
  2. Run "lsmod | grep pcore" to confirm that the pcore module is loaded.
  3. Now try "CodeXLPowerProfiler -l" (from the CodeXL bin directory). Error code 0x80004005 is returned because the driver is not installed.
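The check in step 2 can also be scripted. The following is a minimal sketch, not part of CodeXL: the helper name `module_loaded` is hypothetical, and it assumes the usual lsmod layout in which the module name is the first column. Unlike `grep pcore`, it matches the module name exactly rather than as a substring.

```python
def module_loaded(lsmod_text, name="pcore"):
    """Return True if `name` is listed as a module in lsmod-style output.

    `module_loaded` is a hypothetical helper written for illustration.
    It assumes the module name is the first whitespace-separated column.
    """
    for line in lsmod_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


# Example input in the same shape as lsmod output for a loaded pcore module.
print(module_loaded("pcore                  32525  0"))
```

An empty result from "lsmod | grep pcore" corresponds to `module_loaded` returning False, which in this thread meant the power profiler driver had not been installed yet.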

Also, could you please tell us which AMD GPU you are using? An Intel-based CPU should not be an issue.

--Aalok

Hi,

Thanks very much for your reply. It was very helpful.

1. I installed the driver.

2. lsmod | grep pcore gives "pcore                  32525  0"

3. CodeXLPowerProfiler -l gives "Power Profiler is not supported on the current platform (error code 0x80080011)."

We're running an AMD R9 295X2.

Ken
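For reference, the two error codes seen so far in this thread can be kept in a small lookup. This is only a sketch: the table `KNOWN_CODES` and the helper `describe` are hypothetical names, and the messages are copied from the CodeXLPowerProfiler output quoted above, not from the CodeXL headers.

```python
# Messages as printed by "CodeXLPowerProfiler -l" in this thread.
# This table is illustrative and covers only the codes quoted here.
KNOWN_CODES = {
    0x80004005: "Failed to initialize the driver.",
    0x80080011: "Power Profiler is not supported on the current platform",
}


def describe(code):
    """Format a result code as 8-digit hex with the message seen in this thread."""
    code &= 0xFFFFFFFF  # normalise a signed 32-bit value to unsigned
    message = KNOWN_CODES.get(code, "unknown code (not seen in this thread)")
    return "0x{:08x}: {}".format(code, message)


print(describe(0x80080011))
```

Printing the value returned by AMDTPwrProfileInitialize in hexadecimal makes it easier to compare with the codes that the command-line tool reports.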

aalok
Staff

Hi Ken,

Thanks for the reply.

Could you please provide the device ID of your GPU?

The following Linux command will help to get the device ID:

     lspci -vnn | grep VGA -A 12

--Aalok


Hi Aalok,

Thanks for your reply. We are running on a virtual system. lspci doesn't have any output whatsoever. We will try to expose the PCIe card to the virtual OS.

Best Regards,

Ken

aalok
Staff

Hi Ken,

CodeXL doesn't support power profiling in virtual environments because the guest OS has limited access to the relevant physical counters.

So in this case the error thrown by the Power Profiler is expected behaviour.

--Aalok


Hi Aalok,

Many thanks for your support with this. I will see if it's possible to have the card moved to a bare metal system.

Best Regards,

Ken


Hi Aalok,

Turns out the machines are not virtual.

Can you suggest reasons why I'd get an unsupported platform error otherwise?

We now have output from lspci.

08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Vesuvius [Radeon R9 295X2] [1002:67b9] (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:0b2a]
        Flags: bus master, fast devsel, latency 0, IRQ 40
        Memory at a0000000 (64-bit, prefetchable) [size=256M]
        Memory at b0000000 (64-bit, prefetchable) [size=8M]
        I/O ports at c000 [size=256]
        Memory at fb600000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at fb640000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
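The device ID requested earlier is the [vendor:device] pair on the "VGA compatible controller" line. As a minimal sketch, assuming output in the `lspci -vnn` format shown above, it can be pulled out with a regular expression. The helper name `vga_device_ids` is hypothetical.

```python
import re

# Matches a bracketed pair of 4-digit hex IDs such as [1002:67b9].
ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def vga_device_ids(lspci_text):
    """Return (vendor, device) ID pairs from the VGA controller lines only.

    Subsystem lines also carry a bracketed ID pair, so lines are filtered
    on the "VGA compatible controller" class name first.
    """
    ids = []
    for line in lspci_text.splitlines():
        if "VGA compatible controller" not in line:
            continue
        match = ID_RE.search(line)
        if match:
            ids.append((match.group(1).lower(), match.group(2).lower()))
    return ids
```

For the output above this yields vendor 1002 and device 67b9; the subsystem pair [1002:0b2a] is deliberately skipped.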

We're getting close. Any ideas?

Best Regards,

Ken


Hi Ken,

Thanks for sharing the results.

Could you please share the contents of the /tmp/PwrBackendTrace.txt file? This file is created when power profiling runs and captures some profiling information, which might give us a clue for debugging this failure.

Regards

--Aalok


Hi Aalok,

Thanks for your reply. Here are the contents of that file.

IsDgpuMMIOAccessible:1398-
Command reg 0x100007
GetAvailableSmuList:1642- dGPU mmio access 1
IsSmuLogAccessible:1470- IsSmuLogAccessible: SMU_PM_STATUS_LOG_START failed
GetAvailableSmuList:1671- Device:B7/D0/F0 dev-id:0x67b9, hw-type:6, dev-type:1, model:Hawaii, sname:Hawaii Gemini, ip-ver: 71, access:0
IsDgpuMMIOAccessible:1398-
Command reg 0x100007
GetAvailableSmuList:1642- dGPU mmio access 1
IsSmuLogAccessible:1470- IsSmuLogAccessible: SMU_PM_STATUS_LOG_START failed
GetAvailableSmuList:1671- Device:B8/D0/F0 dev-id:0x67b9, hw-type:6, dev-type:1, model:Hawaii, sname:Hawaii Gemini, ip-ver: 71, access:0
PrepareSystemTopologyInfo:1225-  AMDT_WARN_SMU_DISABLED Apu Smu not accessible
PrepareSystemTopologyInfo:1477-  AMDT_WARN_SMU_DISABLED for dGPU-1
PrepareSystemTopologyInfo:1477-  AMDT_WARN_SMU_DISABLED for dGPU-2
AMDTPwrProfileInitialize:1655- Return code: 0x80080011 counters 0
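The trace above can be condensed into a short summary. The sketch below assumes the line format shown in this post and is not an official parser; the helper name `summarise_trace` is hypothetical. The meaning of the `access` field is not documented in this thread. Reading `access:0` as "SMU log not accessible" is an inference from the neighbouring SMU_PM_STATUS_LOG_START and AMDT_WARN_SMU_DISABLED lines.

```python
import re

DEVICE_RE = re.compile(
    r"Device:(?P<bdf>\S+)\s+dev-id:(?P<dev_id>0x[0-9a-fA-F]+).*?access:(?P<access>\d+)"
)
RETURN_RE = re.compile(
    r"Return code:\s*(?P<code>0x[0-9a-fA-F]+)\s+counters\s+(?P<counters>\d+)"
)


def summarise_trace(text):
    """Collect per-device access flags, SMU warnings and the final return code."""
    devices = []
    warnings = 0
    result = None
    for line in text.splitlines():
        if "AMDT_WARN_SMU_DISABLED" in line:
            warnings += 1
        match = DEVICE_RE.search(line)
        if match:
            devices.append({
                "device": match.group("bdf"),
                "dev_id": match.group("dev_id"),
                # Assumed meaning: 0 = SMU log not accessible (see lead-in).
                "access": int(match.group("access")),
            })
            continue
        match = RETURN_RE.search(line)
        if match:
            result = {
                "code": int(match.group("code"), 16),
                "counters": int(match.group("counters")),
            }
    return {"devices": devices, "smu_disabled_warnings": warnings, "result": result}
```

Applied to this trace, the summary shows two devices with ID 0x67b9, both with `access` 0, three SMU-disabled warnings, and a final return code of 0x80080011 with 0 counters.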

If there's any more information I can provide, just let me know.

Best Regards,

Ken


Hi Ken,

Thanks for your response.

We will try to replicate this issue in our lab. Please give me some time; I will come back to you on this.

Regards

Aalok


Thank you very much. I look forward to your findings.


Hi,

Have you been able to replicate this issue in your lab?

Best Regards,

Ken


Hi Ken,

Thanks for your reply.

So far we have not been able to reproduce this issue.

We will keep trying and get back to you on this.

Regards

Aalok


Hi Ken,

Thanks for your patience.

Unfortunately, we have not been able to reproduce this issue in our local setup.

I need some more information from you to debug it further.

  • Run the following command:

             /sbin/lspci -vnnx | grep VGA -A 20

             This dumps the PCI header information for the VGA controllers to the screen. Please attach that dump.

  • Please verify that the driver is installed properly with the command "lsmod | grep pcore".
  • Run the command "./CodeXLPowerProfiler -l" and attach the file generated at "/tmp/PwrBackendTrace.txt".
  • How many PCI slots are available? The last PCI details you attached show two dGPUs attached to your system. Is this issue also seen with a single dGPU? Can you try reproducing it with only one GPU in the first PCI slot?
  • Make sure your Catalyst driver is updated once you change the configuration.

Regards,

Aalok
