Before I begin, let me provide an overview of what I am using OpenCL for (for context):
I have a program that finds out which pieces of geometry from set A are intersected by individual items in set B using the stencil buffer: an item from set B is used to carve out stencil regions where it intersects A. Note that sections of A are colored based on a 1D index (I basically convert a 1D index to an RGB value), so if only certain colors are visible after the stenciling operation, then only the corresponding regions are intersected. These can be found by looping through all displayed pixels, finding the visible (i.e. non-background color) pixels, and converting their colors back to 1D indices.
Since it is quite slow to do all of this on the CPU (I have to do it for each item in set B), I use OpenCL. My OpenCL program requires a framebuffer object (FBO) that I render into. This FBO has a texture object that is the same size as the viewport, so once the intersections of A and an item of set B have been rendered, it is easy to count the colored pixels with a kernel.
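To make the index/color round trip concrete, here is a minimal CPU-side sketch of the idea. It is not the code from my attachment: the names (Rgb, indexToRgb, rgbToIndex, visibleIndices), the 24-bit packing, the black background, and the +1 offset that keeps index 0 from colliding with the background are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical pixel type; the real program reads these back from the FBO texture.
struct Rgb { std::uint8_t r, g, b; };

// Assumption: the background is pure black, i.e. the packed value 0.
constexpr std::uint32_t kBackground = 0;

// Pack a 1D index into a 24-bit color. The index is shifted by one so that
// index 0 never maps to the (assumed) black background.
Rgb indexToRgb(std::uint32_t index) {
    const std::uint32_t v = index + 1;
    assert(v <= 0xFFFFFFu);  // must fit in 24 bits
    return Rgb{static_cast<std::uint8_t>((v >> 16) & 0xFF),
               static_cast<std::uint8_t>((v >> 8) & 0xFF),
               static_cast<std::uint8_t>(v & 0xFF)};
}

// Convert a color back to its 1D index. Returns false for background pixels.
bool rgbToIndex(Rgb c, std::uint32_t& index) {
    const std::uint32_t v = (static_cast<std::uint32_t>(c.r) << 16) |
                            (static_cast<std::uint32_t>(c.g) << 8) |
                            static_cast<std::uint32_t>(c.b);
    if (v == kBackground) return false;
    index = v - 1;
    return true;
}

// Loop over all displayed pixels and collect the indices of the visible
// (non-background) ones, i.e. the sections of A that were intersected.
std::set<std::uint32_t> visibleIndices(const std::vector<Rgb>& pixels) {
    std::set<std::uint32_t> found;
    for (const Rgb& p : pixels) {
        std::uint32_t index = 0;
        if (rgbToIndex(p, index)) found.insert(index);
    }
    return found;
}
```

Calling visibleIndices on the pixels read back after stenciling with one item of set B would give the set of sections of A that the item intersects.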
Let me list my system specs: MacBook Pro 8,2, Core i7 2.2 GHz, AMD Radeon HD 6750M, OS X 10.8.2. It is important to note that the following code ran fine on previous versions of Mac OS X (and/or previous versions of the dev tools; I am not sure whether a dev tools update caused the problem), but now it stalls, and I have no idea why. I am going to leave out other details that are probably irrelevant, i.e. the OpenGL-related code, unless they would help in understanding my code.
I'm attaching a cpp file that has all of the relevant C++ and kernel code. It contains some things specific to my program, but you can focus on the OpenCL calls to see if there is anything out of the ordinary. Here is the sequence of events:
1. The program starts up (note: this is a Qt program).
2. The Qt window that owns the relevant OpenCL "handler" object starts up afterwards.
3. Each time this OpenCL Qt window updates, it updates the handler object, which runs the kernel.
4. The kernel runs twice, where each iteration analyzes all intersections between objects in sets A and B at a per-pixel level.
5. When I bring the window with the OpenCL handler to the foreground, the kernel runs an additional time and stalls when testing for intersections for an object in set B.
Here is the gdb backtrace at the point where it stalls:
#0 0x00007fff918ab122 in __psynch_mutexwait ()
#1 0x00007fff8ca5fd9d in pthread_mutex_lock ()
#2 0x00000001106b9cbd in gldFlushQueue ()
#3 0x000000010b3d393f in IOAccelContextFinishResourceSysMem ()
#4 0x000000010b3df5b0 in gpumAcquireFenceOnQueue ()
#5 0x00000001106c2fcd in gldCopyBufferDataWithQueue ()
#6 0x00007fff934ad6b3 in GCC_except_table49 ()
#7 0x00007fff934caf1b in clFinish ()
#8 0x00007fff96b850b6 in _dispatch_client_callout ()
#9 0x00007fff96b86723 in _dispatch_barrier_sync_f_invoke ()
#10 0x00007fff934caddb in clFinish ()
#11 0x00007fff934c7792 in clSetEventCallback ()
#12 0x00007fff934be1e6 in clEnqueueWriteBuffer ()
#13 0x0000000100020bde in CLHandler::update (this=0x166a08fe0, tagged=@0x7fff5fbfe3a0, w=761, h=711) at clhandler.cpp:310 <-- stalls at call to clEnqueueWriteBuffer
#14 0x000000010002f1f1 in CustomBladesGLWidget::render (this=0x10724c850, indexMode=true, offset=139704, numPnts=4524, blobID=26, bladeIdsIntersected=@0x7fff5fbfe3a0) at customBladesGLWidget.cpp:943
Anyway, sorry for being so verbose. I didn't want to leave out too many details. I hope that the gdb information will come in handy. All of the other details I provided are for context.
forumsquestion.cpp.zip 3.3 KB