I would like to step through an OpenCL.Net kernel I built. I tried CodeXL 1.2, but it does not hit any of the breakpoints I added to a
"test.cl" file. Has anyone had any luck doing this?
I accidentally clicked "Assumed Answered" and cannot undo it. Please ignore.
Hi sunsetquest,
There is a known issue with the way CodeXL performs interception of the OpenCL API with CLR (.NET) applications.
As a workaround:
1. Navigate to the CodeXL folder (usually, C:\Program Files (x86)\AMD\CodeXL\ )
2. Open the spies\ subdirectory
2a. If the application you are debugging is a 64-bit application, open the spies64\ subdirectory instead.
3. Copy OpenCL.dll and OpenCL.pdb to the folder where your debugged executable (.exe) resides.
3a. If you also use and want to debug OpenGL code, you need to copy opengl32.dll and opengl32.pdb from the spies directory to the same path.
4. Run the project as usual in CodeXL
5. When you are done using CodeXL, remove the files you copied from the application folder, or you won't be able to run the application outside of CodeXL.
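If you switch between debugging with and without CodeXL often, the copy and cleanup steps above can be scripted. The following is only a minimal sketch: the class and method names are hypothetical (they are not part of CodeXL), the default install path is the "usual" one mentioned in step 1, and it assumes your application does not ship its own OpenCL.dll next to the .exe, since the copy overwrites and the cleanup deletes files with that name.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of CodeXL: automates the manual copy/remove steps.
static class CodeXlSpies
{
    // Usual install location from step 1; adjust if CodeXL lives elsewhere.
    static readonly string DefaultRoot = @"C:\Program Files (x86)\AMD\CodeXL";

    // File names from step 3.
    static readonly string[] SpyFiles = { "OpenCL.dll", "OpenCL.pdb" };

    // spies\ for a 32-bit debugged application, spies64\ for a 64-bit one.
    public static string SpyDirectory(string codeXlRoot, bool targetIs64Bit)
    {
        return Path.Combine(codeXlRoot, targetIs64Bit ? "spies64" : "spies");
    }

    // Copies the spy files next to the debugged executable (overwrites existing copies).
    public static void Install(string codeXlRoot, string exeDirectory, bool targetIs64Bit)
    {
        string spyDir = SpyDirectory(codeXlRoot, targetIs64Bit);
        foreach (string name in SpyFiles)
            File.Copy(Path.Combine(spyDir, name), Path.Combine(exeDirectory, name), true);
    }

    // Removes the copies so the application runs normally outside CodeXL.
    public static void Remove(string exeDirectory)
    {
        foreach (string name in SpyFiles)
            File.Delete(Path.Combine(exeDirectory, name));
    }

    static int Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("usage: CodeXlSpies <install|remove> <exeDirectory> [x64]");
            return 1;
        }

        if (args[0] == "install")
            Install(DefaultRoot, args[1], args.Length > 2 && args[2] == "x64");
        else
            Remove(args[1]);
        return 0;
    }
}
```

For example, `CodeXlSpies install bin\Debug x64` before a CodeXL session and `CodeXlSpies remove bin\Debug` afterwards. If you also debug OpenGL code, opengl32.dll and opengl32.pdb would need to be added to the file list.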
Hope this helps,
Hi Uri, I tried your instructions and I think I'm closer. I copied the OpenCL.dll and OpenCL.pdb files into the folder where the .exe file is and ran the debugger, but it's still not hitting the breakpoints. I think it's not hitting them because OpenCL.Net loads the test.cl file into a string and then hands it off to the OpenCL wrapper.
Here 'source' is a string:
program = Cl.CreateProgramWithSource(_context, 1, new[] { source }, new[] { (IntPtr)source.Length }, out error);
...so there is no way for the CodeXL debugger to really know that a test.cl file exists. (this is just a guess of mine though)
I also briefly tried to get the workaround to work with the 'Cloo' wrapper, but I could not get it to work there either. I did not spend much time with the Cloo wrapper, however.
Thank you for your help. I am definitely closer now.
Hi sunsetquest,
When an OpenCL program is created via clCreateProgramWithSource, CodeXL should intercept that function call and compare the source string against available *.cl files (available = part of the project in Visual Studio, or present in the "Kernel sources" folder in standalone CodeXL). However, if the cl source does not match any of those files (or there are no such files), CodeXL still logs the creation of the program - some applications have OpenCL kernels embedded in the source as string constants, or even generate OpenCL code on-the-fly, and CodeXL supports those as well:
When CodeXL thinks the application "generates" OpenCL source code, it creates a temporary cl source file and uses that as the source for the kernel, showing it if you try to debug the kernel.
CodeXL also has API-level debugging: it allows you to set breakpoints at OpenCL API functions, such as clCreateProgramWithSource or clEnqueueNDRangeKernel.
To debug an OpenCL kernel, you can also set an API breakpoint on clEnqueueNDRangeKernel (via CodeXL's "New CodeXL breakpoint..." dialog) - this should show the API call in your code if you debug with CodeXL. Once you get there, simply press "Step in" (F11) to go into kernel debugging. This will also take you to the aforementioned temporary kernel source file. The same is true for clEnqueueTask, if you happen to use that API.
I may have misread your original post, and the workaround I described might not be necessary. Please try the following:
1. In CodeXL's breakpoints dialog, set breakpoints on "clGetPlatformIDs", "clCreateContext" and "clCreateContextFromType"
2. Debug with CodeXL. One of the breakpoints, most likely clGetPlatformIDs, will be hit.
3. Now undo the workaround I described in my earlier post (remove the files you copied into the application folder).
4. Debug with CodeXL and the three breakpoints again. If none of the breakpoints are hit, the workaround was necessary; restore it (until you no longer need to use CodeXL). If the same breakpoint is hit again, the workaround was not needed and you can use CodeXL without it.
5. Once you're done verifying, you can remove those three breakpoints, and proceed to add clEnqueueNDRangeKernel as a breakpoint as explained in this post.
I hope this helps,
Hi Uri, thank you for all your help. After some playing around, it finally worked!!! The manual breakpoint and F11 you had me try fixed it. I can now debug C#/.NET OpenCL applications using CodeXL… this rocks!
In a nutshell, after adding a CodeXL breakpoint at clEnqueueNDRangeKernel and starting the debugger, I would get a new tab saying “No Source Available. No Symbols are loaded for any call stack frame.” Clicking the options there, such as Browse to find Source or Show Disassembly, did nothing. Then I noticed in the 'CodeXL Function Call history' that it appeared to be sitting at a breakpoint, so I pressed F11 a few times and it jumped to the auto-generated kernel in the temp folder you mentioned. From there, I could add breakpoints and debug as needed.
Some other notes…
Thank you again, Uri, for your persistent help on this. Your support is A+++++. I was about to give up on this and just use printf for debugging my kernels, but this is 1000 times better. Thank you again.
I will post this thread in the OpenCL.Net user forums so it can hopefully help others.
Here is what I did to get the debugging to work for C# / Visual Studio 2010. I am not sure if this is the best way, but it is the only way I had any luck with:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using OpenCL.Net.Extensions;
using OpenCL.Net;

namespace OpenClDebugTest
{
    class TestProgram
    {
        static void Main()
        {
            const int count = 2048;

            // Create a random array of floats
            var random = new Random();
            float[] data = (from i in Enumerable.Range(0, count) select (float)random.NextDouble()).ToArray();

            // Create a compute device, create a context and a command queue
            Event event0;
            ErrorCode err;
            Platform[] platforms = Cl.GetPlatformIDs(out err);
            Device[] devices = Cl.GetDeviceIDs(platforms[0], DeviceType.Gpu, out err);
            Device device = devices[0]; //cl_device_id device;
            Context context = Cl.CreateContext(null, 1, devices, null, IntPtr.Zero, out err);
            CommandQueue cmdQueue = Cl.CreateCommandQueue(context, device, CommandQueueProperties.None, out err);

            // Create and build a program from our OpenCL-C source code
            string programSource = @"
                __kernel void doubleMe(__global float* input, __global float* output)
                {
                    size_t i = get_global_id(0);
                    output[i] = input[i] + input[i];
                }";
            Program program = Cl.CreateProgramWithSource(context, 1, new[] { programSource }, null, out err);
            Cl.BuildProgram(program, 0, null, string.Empty, null, IntPtr.Zero); //"-cl-mad-enable"

            // Check for any compilation errors
            if (Cl.GetProgramBuildInfo(program, device, ProgramBuildInfo.Status, out err).CastTo<BuildStatus>() != BuildStatus.Success)
            {
                if (err != ErrorCode.Success)
                    Console.WriteLine("ERROR: " + "Cl.GetProgramBuildInfo" + " (" + err.ToString() + ")");
                Console.WriteLine("Cl.GetProgramBuildInfo != Success");
                Console.WriteLine(Cl.GetProgramBuildInfo(program, device, ProgramBuildInfo.Log, out err));
            }

            // Create a kernel from our program
            Kernel kernel = Cl.CreateKernel(program, "doubleMe", out err);

            // Allocate the input buffer
            Mem memInput = (Mem)Cl.CreateBuffer(context, MemFlags.ReadOnly, sizeof(float) * count, out err);

            // Create an output memory buffer for our results
            Mem memoutput = (Mem)Cl.CreateBuffer(context, MemFlags.WriteOnly, sizeof(float) * count, out err);

            // Copy our host buffer of random values to the input device buffer
            Cl.EnqueueWriteBuffer(cmdQueue, (IMem)memInput, Bool.True, IntPtr.Zero, new IntPtr(sizeof(float) * count), data, 0, null, out event0);

            // Get the maximum number of work items supported for this kernel on this device
            // (the result is a size_t, which is pointer-sized)
            IntPtr notused;
            InfoBuffer local = new InfoBuffer(new IntPtr(IntPtr.Size));
            Cl.GetKernelWorkGroupInfo(kernel, device, KernelWorkGroupInfo.WorkGroupSize, new IntPtr(IntPtr.Size), local, out notused);

            // Set the two arguments to our kernel (cl_mem handles are pointer-sized), and enqueue it for execution
            Cl.SetKernelArg(kernel, 0, new IntPtr(IntPtr.Size), memInput);
            Cl.SetKernelArg(kernel, 1, new IntPtr(IntPtr.Size), memoutput);
            IntPtr[] globalWorkSize = new IntPtr[] { new IntPtr(count) };
            Cl.EnqueueNDRangeKernel(cmdQueue, kernel, 1, null, globalWorkSize, null, 0, null, out event0);

            // Force the command queue to get processed, wait until all commands are complete
            Cl.Finish(cmdQueue);

            // Read back the results
            float[] results = new float[count];
            Cl.EnqueueReadBuffer(cmdQueue, (IMem)memoutput, Bool.True, IntPtr.Zero, new IntPtr(count * sizeof(float)), results, 0, null, out event0);

            // Validate our results
            int correct = 0;
            for (int i = 0; i < count; i++)
                correct += (results[i] == data[i] + data[i]) ? 1 : 0;

            // Print a brief summary detailing the results
            Console.WriteLine("Computed {0} of {1} correct values!", correct, count);
            Console.ReadKey();
        }
    }
    //Sources:
    // - examples from http://sa10.idav.ucdavis.edu/docs/sa10-dg-opencl-overview.pdf (Derek Gerstmann)
    // - examples from https://openclnet.codeplex.com/ (Ananth Balasubramaniam)
}
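When stepping through the kernel in the debugger, it can help to have the expected values at hand. Below is a small CPU-side reference for the doubleMe kernel that needs no OpenCL at all; it is only a sketch, and the class name, method names, and sample values are made up for illustration. It compares with exact equality, which should be safe here because doubling a float is exact.

```csharp
using System;
using System.Linq;

// Hypothetical CPU reference for the doubleMe kernel; independent of OpenCL.Net.
static class DoubleMeReference
{
    // CPU equivalent of the kernel body: output[i] = input[i] + input[i].
    public static float[] Compute(float[] input)
    {
        return input.Select(x => x + x).ToArray();
    }

    // Counts the elements where the device result equals the CPU reference.
    public static int CountCorrect(float[] input, float[] results)
    {
        float[] expected = Compute(input);
        int correct = 0;
        for (int i = 0; i < expected.Length && i < results.Length; i++)
            if (results[i] == expected[i]) correct++;
        return correct;
    }

    static void Main()
    {
        float[] data = { 0.25f, 0.5f, 1.5f };
        float[] fakeDeviceResults = { 0.5f, 1.0f, 3.5f }; // last value deliberately wrong
        // Prints "2 of 3 correct"
        Console.WriteLine("{0} of {1} correct", CountCorrect(data, fakeDeviceResults), data.Length);
    }
}
```

In the program above, `DoubleMeReference.CountCorrect(data, results)` could stand in for the validation loop after the read-back.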