oscarbarenys1
Adept II

How can gDEBugger support GPU debugging even with a single GPU?

If the HW doesn't support debugging..

Looking at Graphics Core Next, it seems AMD is adding GPU hardware debugging support similar to what CPUs and Nvidia GPUs have. But then how can gDEBugger already support debugging on current AMD cards? See "gDEBugger's single-GPU debug capability impresses at SIGGRAPH".

Even single-GPU debugging doesn't make sense to me: when a breakpoint is hit the GPU would lock up, which Nvidia gives as the reason Nsight needs two GPUs.

Can AMD engineers shed some light on this? And what will Graphics Core Next debugging support then add?

0 Likes
8 Replies

The current GPU debugging implementation on AMD hardware is a software solution, not a hardware solution. While debugging you do run on the hardware, but the hardware itself does not accelerate the process.
0 Likes

Hi Micah,

Sorry, but your answer is ambiguous. What does "run on the hardware" but "not accelerated by the hardware" mean?

You say it is a software solution, so is the code actually running on the GPU compute units when debugged, or is it simulated in some way?

0 Likes

anton_a_1977,
There are no hardware debugging features in AMD GPUs, so the hardware does not accelerate any debugging commands. However, when you inspect a variable, the results are from the hardware and not from a software emulator or the CPU. So the results of debugging are the same as the results of running your program.
0 Likes

"so the hardware does not accelerate any debugging commands"

 

Sorry again, but I just don't understand what it means. What is the difference between running OpenCL code and "accelerating" OpenCL code on a GPU?

Without hardware support you can't actually make the GPU stop, right? So what happens when the debugger is stopped on a breakpoint? Is this recorded information from a run that has already finished? And what happens when I step? Is it run again?

I ask for practical reasons, not only curiosity. I have some trouble in my code, possibly due to race conditions between work-items. So I started debugging, and I'm puzzled that I can see the values of variables in all work-items. How can this be? The order in which work-items run is not defined, so when I am in work-item 47, has work-item 135 always already run so that I can see its values? I'm trying to understand whether I can use the debugger to debug my race condition problems (or at least catch them happening).

0 Likes

OpenCL debugging is run on the CPU, always. Most likely it uses the same LLVM architecture for debugging as it does for compiling kernels. When you debug a GPU application, you will see that the GPU is actually not used. In some respects your application will run differently than in an actual GPU environment, but it does let you find a lot of glitches in the code.

Sync issues might appear in a much more "ordered" manner than on a GPU, but it does let you play around a little. If you really don't know what causes the sync issues, insert sync commands at every step and then start taking them out one by one.
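
The "insert sync commands everywhere, then take them out one by one" approach can be sketched roughly as below. This is only an illustration, not part of any debugger or of OpenCL: `minimize_barriers`, the barrier names, and the `kernel_is_correct` callback are all hypothetical. In practice the callback would rebuild the kernel with only the listed barriers enabled, run it (ideally several times, since a race can pass by luck), and compare the output against a known-good result.

```python
from typing import Callable, FrozenSet, Set


def minimize_barriers(
    all_barriers: Set[str],
    kernel_is_correct: Callable[[FrozenSet[str]], bool],
) -> Set[str]:
    """Greedily drop barriers one at a time, keeping only those whose
    removal makes the (hypothetical) correctness check fail."""
    kept = set(all_barriers)
    if not kernel_is_correct(frozenset(kept)):
        raise ValueError("kernel is wrong even with every barrier enabled")
    for barrier in sorted(all_barriers):
        trial = kept - {barrier}
        if kernel_is_correct(frozenset(trial)):
            kept = trial  # this barrier was not needed
    return kept


# Stand-in for "build, run and verify the kernel": in this toy setup only
# the barrier after the local-memory write is actually required.
def fake_check(enabled: FrozenSet[str]) -> bool:
    return "after_local_write" in enabled


needed = minimize_barriers(
    {"after_load", "after_local_write", "before_store"}, fake_check
)
print(needed)
```

The barriers that survive point at the places where work-items really depend on each other's data, which is usually where the race is.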

0 Likes

Originally posted by: Meteorhead OpenCL debugging is run on the CPU, always. Most likely it uses the same LLVM architecture for debugging as it does for compiling kernels. When you debug a GPU application, you will see that the GPU is actually not used. In some respects your application will run differently than in an actual GPU environment, but it does let you find a lot of glitches in the code.

Sync issues might appear in a much more "ordered" manner than on a GPU, but it does let you play around a little. If you really don't know what causes the sync issues, insert sync commands at every step and then start taking them out one by one.

Are you sure about this? It is very worrying!

MicahVillmow said in answer to my question:

"the results are from the hardware and not a software emulator or the CPU"

And you say the results are actually from the CPU and not the GPU?

0 Likes

I'm sorry, I had concluded this from another topic. Micah made this pretty clear here. My bad.

Can I ask, then, how this is achieved? Somebody asked me this question earlier, and I said that most likely kernels end at the breakpoint, save all registers somewhere, dump this information back to the host, and load it again when resuming execution. Or do breakpoints use a global sync that is not currently accessible from OpenCL?

0 Likes

Since there is no hardware support for debugging on the GPU (no breakpoints, interrupts, etc.), the debugger is what is called a software debugger. When you set a breakpoint, the debugger regenerates your program so that it only runs up to that point, writes out all the data to an internal buffer, exits, and then displays all of the data to the user. To the user this looks like debugging on the GPU, but the debugging is happening on both the CPU and the GPU. This is why I said the debugger is not hardware accelerated, but the results are still from running the program on the GPU.
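
To make the idea concrete, here is a toy model of that scheme in plain Python. It is only a sketch under stated assumptions, not the actual debugger code: the "kernel" is a hypothetical list of statements, `run_to_breakpoint` is an invented name, and the loop over work-items stands in for a real GPU launch. Treating a single step as "rerun with the breakpoint moved one statement further" is also an assumption made for the illustration.

```python
from typing import Callable, Dict, List

# One "statement" of the toy kernel: it updates the private variables
# of the work-item with the given global id.
Statement = Callable[[int, Dict[str, int]], None]


def stmt_load(gid: int, v: Dict[str, int]) -> None:
    v["x"] = gid * 2


def stmt_add(gid: int, v: Dict[str, int]) -> None:
    v["y"] = v["x"] + 1


def stmt_square(gid: int, v: Dict[str, int]) -> None:
    v["z"] = v["y"] * v["y"]


KERNEL: List[Statement] = [stmt_load, stmt_add, stmt_square]


def run_to_breakpoint(
    kernel: List[Statement], breakpoint_index: int, global_size: int
) -> List[Dict[str, int]]:
    """Run a truncated copy of the kernel for every work-item and return
    a dump of each work-item's variables (the "internal buffer")."""
    truncated = kernel[:breakpoint_index]  # regenerated program: stops before the breakpoint
    debug_buffer: List[Dict[str, int]] = []
    for gid in range(global_size):  # stand-in for the GPU launch
        variables: Dict[str, int] = {}
        for statement in truncated:
            statement(gid, variables)
        debug_buffer.append(variables)  # write out all the data, then the kernel exits
    return debug_buffer


# Breakpoint before the third statement, four work-items.
at_breakpoint = run_to_breakpoint(KERNEL, 2, 4)
print(at_breakpoint[3])  # {'x': 6, 'y': 7}

# "Step": rerun with the breakpoint moved one statement further.
after_step = run_to_breakpoint(KERNEL, 3, 4)
print(after_step[3])  # {'x': 6, 'y': 7, 'z': 49}
```

Because the truncated kernel runs to completion for every work-item before anything is displayed, the values of all work-items are available at the breakpoint, whatever order they actually executed in.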
0 Likes