How to debug a crash


Often, when I run my application, the OpenCL kernel crashes!
I have no idea where the problem is because there is a lot of OpenCL code in this kernel. I have been searching for 2 weeks, but I really have no idea what the problem is!
So, do you have any ideas to help me debug this?

11 Replies

Have you tried GDB? printf can also help in locating the region where the crash might be occurring.

Sometimes I try debugging with a single thread. That reduces some of the complexity.



I only have a Windows 7 computer! So GDB is not useful!

I also use printf a lot, but without success!


My problems appeared when I started using "float3" instead of typedef struct { float x,y,z; } myfloat3; ...

I often pass float3 by pointer, like this. Is it allowed?


float3 a = ...;
mymethod(&a);

void mymethod(float3* val)
{
    (*val).x += 1;
}


I would strongly recommend that you install Cygwin on your Windows system, run the code on the CPU (by choosing CL_DEVICE_TYPE_CPU when you create your context), and then debug using both GDB and Visual Studio.


To debug using Visual Studio, add a call to clFinish() after you enqueue each kernel. Step through the program in the Visual Studio debugger, stopping after each clFinish() call. This will let you identify which kernel is crashing.


Once you know which kernel is crashing, change the program to compile with the "-g" flag (in your statement). You can then use GDB to step through the kernel that is the problem. Be sure to set CPU_MAX_COMPUTE_UNITS=1 and read the instructions in chapter 3 of the Stream SDK Programming Guide, which tell you how to use GDB.
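As a rough illustration of the CPU_MAX_COMPUTE_UNITS step, the variable could also be set from inside the host program before the first OpenCL call. This is only a sketch: the helper name limit_cpu_compute_units is hypothetical, setenv() is POSIX (available under Cygwin, not under plain MinGW or MSVC), and it assumes, without verification, that the runtime reads the variable when it is first initialised. Setting the variable in the shell before launching the program is the safer route.

```c
#define _POSIX_C_SOURCE 200112L  /* exposes setenv() in strict C modes */
#include <stdlib.h>

/*
 * Hypothetical helper: ask the CPU device to use a single compute unit,
 * so that only one worker thread runs while stepping in GDB.
 * ASSUMPTION: the OpenCL runtime reads CPU_MAX_COMPUTE_UNITS lazily, on
 * its first initialisation, so this must run before any cl* call.
 * Returns 0 on success, -1 on failure (the setenv() convention).
 */
int limit_cpu_compute_units(void)
{
    /* Third argument 1 = overwrite any value already in the environment. */
    return setenv("CPU_MAX_COMPUTE_UNITS", "1", 1);
}
```

Call limit_cpu_compute_units() at the very top of main(), before the context is created.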


Hope this helps.


And yes, it is legal to pass a float3 as a pointer to a function.


Thanks Alan

In the past I tried to use ... it is GDB, but integrated into Visual Studio.

To be honest, it was not 100% successful; I ran into 1 or 2 problems. For example, the line numbers were incorrect: they were multiplied by 2!

I'm sure that with a few tests we can use it to debug OpenCL kernels in Visual Studio. The developer of this tool was really responsive to my requests.

Have you ever tested it? It would be great if someone from AMD could spend some time on this.



Finally I used MinGW to debug it, and it seems to work.

But the results are strange... maybe it is a bug in the SDK:

Here are the stack trace, the variables, and the method that crashes!

I'm able to see all the values in the method, so maybe that means there is no error there! It seems that it crashes in a "native" OpenCL function from the SDK!


The stack trace
---------------
#0  0xf4e30000 in ?? ()
#1  0x04e334b8 in BBoxIntersectP (rayOrig=..., invRayDir=..., mint=9.99999975e-005, maxt=inf, pMin=..., pMax=...) at
#2  0x04e33819 in Intersect_Mesh (ray=0x1227fc30, rayHit=0x1227fb60, scene=0x1227fd70) at
#3  0x04e339e2 in Intersect (scene=0x1227fd70, ray=0x1227fc30, rayHit=0x1227fb60) at
#4  0x04e37870 in Trace (scene=0x1227fd70, startRay=0x1227fd40, radiance=0x1227fd20) at
#5  0x04e388ab in __OpenCL_RenderPath_kernel (pixelsCurrentFrame=0x12c50040, pixels=0x6cc0040, paths=0x130e0040, camera=0x661baf0, sky=0x999f6c8, sphereLight=0x9ae7000, sphereLightCount=0, areaLight=0x11066d00, areaLightCount=1, materialReference=0x9aed000, materialReferencesCount=8, bsdfParametersBuffer=0x9aee000 "", pixmapsBuffer=0x9aef000 "", texMaps=0x9af0000, vertices=0x9ae8000, tris=0x9aea000, triangleCount=36, nodeCount=54, bvhTree=0x9aeb000, useBVH=1, width=664, height=597, currentSample=0, workOffset=0, workAmount=396408) at
#6  0x04e38e87 in __OpenCL_RenderPath_stub () from OCLAE99.tmp.dll
#7  0x0d5013fc in clGetSamplerInfo () from c:\Program Files (x86)\ATI Stream\bin\x86\atiocl.dll
#8  0x1228faa0 in ?? ()
#9  0x0d5018d7 in clGetSamplerInfo () from c:\Program Files (x86)\ATI Stream\bin\x86\atiocl.dll
#10 0x09ad1c00 in ?? ()

The variables
-------------
(gdb) print BBoxIntersectP::maxt
$1 = inf
(gdb) print BBoxIntersectP::mint
$2 = 9.99999975e-005
(gdb) print BBoxIntersectP::rayOrig
$3 = {s0 = 15.0219078, s1 = 10.1180973, s2 = 98.1133728}
(gdb) print BBoxIntersectP::rayDir
No symbol "rayDir" in specified context.
(gdb) print BBoxIntersectP::invRayDir
$4 = {s0 = -1.94436169, s1 = -3.78819013, s2 = -1.22553885}
(gdb) print BBoxIntersectP::pMin
$5 = {s0 = -28.0000992, s1 = -9.99999975e-005, s2 = -20.0000992}
(gdb) print BBoxIntersectP::pMax
$6 = {s0 = 28.0000992, s1 = 40.0000992, s2 = 20.0000992}

The method
----------
int BBoxIntersectP(float3 rayOrig, float3 invRayDir, float mint, float maxt, float3 pMin, float3 pMax)
{
    float3 l1 = (pMin - rayOrig) * invRayDir;
    float3 l2 = (pMax - rayOrig) * invRayDir;
    float3 tNear = fmin(l1, l2);
    float3 tFar = fmax(l1, l2);
    float t0 = max(max(max(tNear.x, tNear.y), max(tNear.x, tNear.z)), mint);
    float t1 = min(min(min(tFar.x, tFar.y), min(tFar.x, tFar.z)), maxt);
    return (t1 > t0);
}
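One way to cross-check the arithmetic is to replay the values printed by gdb through a plain C port of BBoxIntersectP on the host. The sketch below is only that: vec3, f_min, f_max, bbox_intersect_p, replay_gdb_values and self_check are hypothetical names, the struct is a host-side stand-in for float3, and the ternary min/max helpers do not reproduce the NaN handling of the OpenCL built-ins. With these inputs the slab test simply reports a miss, which suggests the inputs themselves are unremarkable.

```c
#include <assert.h>
#include <math.h>   /* only for the INFINITY macro */

/* Hypothetical host-side stand-in for an OpenCL float3. */
typedef struct { float x, y, z; } vec3;

/* Simple min/max; unlike OpenCL fmin/fmax these do not special-case NaN. */
static float f_min(float a, float b) { return a < b ? a : b; }
static float f_max(float a, float b) { return a > b ? a : b; }

static vec3 v_sub(vec3 a, vec3 b) { vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }
static vec3 v_mul(vec3 a, vec3 b) { vec3 r = { a.x * b.x, a.y * b.y, a.z * b.z }; return r; }
static vec3 v_min(vec3 a, vec3 b) { vec3 r = { f_min(a.x, b.x), f_min(a.y, b.y), f_min(a.z, b.z) }; return r; }
static vec3 v_max(vec3 a, vec3 b) { vec3 r = { f_max(a.x, b.x), f_max(a.y, b.y), f_max(a.z, b.z) }; return r; }

/* Plain C port of the slab test from the post; returns 1 on hit, 0 on miss. */
int bbox_intersect_p(vec3 rayOrig, vec3 invRayDir, float mint, float maxt,
                     vec3 pMin, vec3 pMax)
{
    vec3 l1 = v_mul(v_sub(pMin, rayOrig), invRayDir);
    vec3 l2 = v_mul(v_sub(pMax, rayOrig), invRayDir);
    vec3 tNear = v_min(l1, l2);
    vec3 tFar  = v_max(l1, l2);
    float t0 = f_max(f_max(f_max(tNear.x, tNear.y), tNear.z), mint);
    float t1 = f_min(f_min(f_min(tFar.x, tFar.y), tFar.z), maxt);
    return t1 > t0;
}

/* Feed in exactly the values shown in the gdb session. */
int replay_gdb_values(void)
{
    vec3 rayOrig   = { 15.0219078f, 10.1180973f, 98.1133728f };
    vec3 invRayDir = { -1.94436169f, -3.78819013f, -1.22553885f };
    vec3 pMin      = { -28.0000992f, -9.99999975e-005f, -20.0000992f };
    vec3 pMax      = { 28.0000992f, 40.0000992f, 20.0000992f };
    return bbox_intersect_p(rayOrig, invRayDir, 9.99999975e-005f, INFINITY, pMin, pMax);
}

/* t0 is about 95.7 and t1 about 38.3 here, so the ray misses the box. */
void self_check(void)
{
    assert(replay_gdb_values() == 0);
}
```

If the host result is sane while the kernel still faults at the same call, that points away from the inputs and towards the generated code or the built-in being called.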


I also got this information from GDB ! What is strange is that l1 and l2 are local variables !



(gdb) up
#1  0x06b534d0 in BBoxIntersectP (currentNode=0, stopNode=54, i=10, rayOrig=..., invRayDir=..., mint=9.99999975e-005, maxt=inf, pMin=..., pMax=...) at C:\Users\polar01\AppData\Local\Temp\
1000    float3 tNear = fmin(l1, l2);
   0x06b53492 <BBoxIntersectP+258>: 0f 28 5d 88              movaps -0x78(%ebp),%xmm3
   0x06b53496 <BBoxIntersectP+262>: 8d 8d 38 ff ff ff        lea    -0xc8(%ebp),%ecx
   0x06b5349c <BBoxIntersectP+268>: 89 0c 24                 mov    %ecx,(%esp)
   0x06b5349f <BBoxIntersectP+271>: f3 0f 11 5c 24 04        movss  %xmm3,0x4(%esp)
   0x06b534a5 <BBoxIntersectP+277>: 66 0f 70 e3 01           pshufd $0x1,%xmm3,%xmm4
   0x06b534aa <BBoxIntersectP+282>: f3 0f 11 64 24 08        movss  %xmm4,0x8(%esp)
   0x06b534b0 <BBoxIntersectP+288>: 0f 12 db                 movhlps %xmm3,%xmm3
   0x06b534b3 <BBoxIntersectP+291>: f3 0f 11 5c 24 0c        movss  %xmm3,0xc(%esp)
   0x06b534b9 <BBoxIntersectP+297>: f3 0f 11 4c 24 10        movss  %xmm1,0x10(%esp)
   0x06b534bf <BBoxIntersectP+303>: f3 0f 11 44 24 14        movss  %xmm0,0x14(%esp)
   0x06b534c5 <BBoxIntersectP+309>: f3 0f 11 54 24 18        movss  %xmm2,0x18(%esp)
   0x06b534cb <BBoxIntersectP+315>: e8 30 cb ff ef           call   0xf6b50000
=> 0x06b534d0 <BBoxIntersectP+320>: 83 ec 04                 sub    $0x4,%esp
   0x06b534d3 <BBoxIntersectP+323>: f3 0f 10 85 38 ff ff ff  movss  -0xc8(%ebp),%xmm0
   0x06b534db <BBoxIntersectP+331>: f3 0f 10 8d 3c ff ff ff  movss  -0xc4(%ebp),%xmm1
   0x06b534e3 <BBoxIntersectP+339>: f3 0f 10 95 40 ff ff ff  movss  -0xc0(%ebp),%xmm2
   0x06b534eb <BBoxIntersectP+347>: f3 0f 11 85 68 ff ff ff  movss  %xmm0,-0x98(%ebp)
   0x06b534f3 <BBoxIntersectP+355>: f3 0f 11 8d 6c ff ff ff  movss  %xmm1,-0x94(%ebp)
   0x06b534fb <BBoxIntersectP+363>: f3 0f 11 95 70 ff ff ff  movss  %xmm2,-0x90(%ebp)



Does the AMD team have any news about my "crash"? I'm still trying to fix the problem, but it is difficult to check what is happening in your generated DLL!

Please, can you advise and tell me the status?




I still have no news from Himanshu & Naganna 😞 Please, can you at least tell me the status so I know which direction to go in?

Should I roll back to the version without "float3, ...", or continue to search for the problem?

Please, help me avoid losing one more week.



Hi Viewon1

The stack trace does make it look like you are inside a built-in function. Can you see what line it is crashing on in your function BBoxIntersectP? Please try to simplify your program to isolate the problem.

Also, please try to use Cygwin/gdb; that is the combination that we support and test with.






Thanks Alan,

I have tried to simplify it; I have been working on this problem for 10 days now 😞

I have sent the software to Himanshu. Is he from the AMD team, or does he work for AMD?

So, I have debugged a lot, and here is what I know:

1 - It crashes at the following line: "float3 tFar = fmax(l1, l2);" (see my code in the previous post). But it is impossible to cause an access violation there!

2 - My software works well with the Intel SDK (just as a test, to show that it should work).

3 - I have removed all the compilation settings ("-g"), and in some "modes" it works.

4 - I have had all these crashes since I switched from struct {float x,y,z} myfloat3; to float3! It is not the first time I have tried to use float3/float8, but each time I fail to run my application with the AMD SDK 😞

5 - I have reviewed, tested, printed, etc. all my structures for correct data and alignment... all seems fine!
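On points 4 and 5, one thing that may be worth re-checking is the layout change itself: per the OpenCL 1.1 specification a 3-component vector such as float3 has the size and alignment of the 4-component type, so it occupies 16 bytes where a struct of three floats occupies 12. The sketch below shows the difference on the host in plain C. The names host_float3, bbox_packed and bbox_vec are hypothetical illustrations and are not taken from the application; on a real host, cl_float3 from CL/cl_platform.h would be the type to use.

```c
#include <stddef.h>  /* offsetof, size_t */

/* The original hand-written type: three tightly packed floats (12 bytes). */
typedef struct { float x, y, z; } myfloat3;

/*
 * Hypothetical host-side stand-in for an OpenCL float3: four floats of
 * storage (16 bytes), the fourth being padding. The real type is also
 * 16-byte aligned, which this plain struct does not enforce.
 */
typedef struct { float s[4]; } host_float3;

/* The same two-field record laid out with each representation. */
typedef struct { myfloat3    pMin; myfloat3    pMax; } bbox_packed;
typedef struct { host_float3 pMin; host_float3 pMax; } bbox_vec;

/* Byte offset of the second field in each layout. */
size_t packed_pmax_offset(void) { return offsetof(bbox_packed, pMax); }
size_t vec_pmax_offset(void)    { return offsetof(bbox_vec, pMax); }
```

If a host buffer is still filled with the 12-byte layout while the kernel reads it as float3, every element after the first is read from the wrong offset, and the last elements are read past the end of the buffer.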

So, I really don't know how to debug this. I think that the problem is in the AMD SDK, simply because it crashes on lines where it is impossible for me to cause an access violation, like "mycal = clamp(f, 0.f, 1.f)" with local variables!

Another strange thing: I have just received my HD6950, and when I compile I just get an error message from the compiler: "Error: Creating kernel RenderPath failed!"


Really, I don't know what to do now! Can you help me?