janten
Journeyman III

[Bug Report] ~cl::CommandQueue() causes application to hang under some circumstances

When an OpenCL command that waits for a cl::Event has been enqueued in a cl::CommandQueue, any subsequent call to clReleaseCommandQueue (as made by ~cl::CommandQueue()) causes the application to hang forever, even if the retain count of the command queue is still > 0.

The bug may be caused by clReleaseCommandQueue performing an implicit finish() on the command queue instead of the implicit flush() that the OpenCL standard describes.

Please see the attached code for an example.

//
//  main.cpp
//  amdtest
//
//  Created by Jan-Gerd Tenberge (janten@gmail.com) on 28.11.11.
//  Copyright (c) 2011 Westfälische Wilhelms-Universität Münster.
//  All rights reserved.
//
//  This example shows a possible bug in the AMD APP SDK where the
//  destructor of a cl::CommandQueue halts for an infinite time if
//  any OpenCL command waiting for a cl::Event is waiting on the queue.

#include <iostream>
#include <vector>
#include <unistd.h> // sleep()
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <boost/function.hpp>

#define __CL_ENABLE_EXCEPTIONS
#include <CL/cl.hpp>

void setStatusComplete(cl::UserEvent event, cl::Buffer buffer, cl::CommandQueue queue);

int main (int argc, const char * argv[])
{
    boost::thread* threadp = NULL;

    try {
        std::vector<cl::Platform> platforms;
        cl::Platform::get(&platforms);

        cl_context_properties props[] = {
            CL_CONTEXT_PLATFORM,
            (cl_context_properties)platforms[0](),
            0
        };

        /*
         * We had no AMD GPU for testing, it is therefore unknown whether the bug
         * affects only CPUs or all devices supported by AMD APP's OpenCL implementation.
         */
        cl::Context context(CL_DEVICE_TYPE_CPU, props);
        std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();
        cl::CommandQueue queue(context, devices[0]);

        int i = 10;
        cl_uint memSize = sizeof(int);
        cl::Buffer input(context, CL_MEM_READ_WRITE, memSize);

        std::vector<cl::Event> eventWaitList;
        cl::UserEvent dataReceipt(queue.getInfo<CL_QUEUE_CONTEXT>());
        eventWaitList.push_back(dataReceipt);

        std::cout << "Waiting for dataReceipt to be of status CL_COMPLETE" << std::endl;

        // Write Buffer after setStatusComplete has finished in a different thread
        queue.enqueueWriteBuffer(input, CL_FALSE, 0, sizeof(int), &i, &eventWaitList, NULL);

        {
            /*
             * Since the bug is triggered by clReleaseCommandQueue run
             * from ~cl::CommandQueue(), this is sufficient to trigger it.
             */
            cl::CommandQueue q2 = queue;

            /* Launch setStatusComplete in another thread.
             *
             * The method shown here is a stub. In real-world usage setStatusComplete
             * receives data over a network connection and calls event.setStatus(CL_COMPLETE);
             * as soon as all data has been retrieved and is ready for upload to the device.
             *
             * boost::bind() will try to copy the object queue, calling the CommandQueue
             * destructor, which will cause the application to hang. The actual execution
             * of the thread one line below will not be reached, causing a deadlock since
             * setStatus(CL_COMPLETE) will never be called on dataReceipt.
             *
             * Expected result: The thread should be started, triggering the status change
             * of dataReceipt after three seconds. This should in turn cause the WriteBuffer
             * command to be executed.
             * The expected result can be observed by using the NVIDIA SDK.
             */
            // boost::function0<void> func = boost::bind(&setStatusComplete, dataReceipt, input, queue);
            // threadp = new boost::thread(func);
        }
        /*
         * Implicit call of ~cl::CommandQueue(); triggers infinite wait here
         * if cl::CommandQueue q2 = queue; is used.
         * Possible cause: clReleaseCommandQueue performs implicit finish()
         * instead of flush().
         */
    } catch (cl::Error& err) {
        std::cout << "Error " << err.err() << " " << err.what() << std::endl;
    }

    if (threadp) {
        threadp->join();
    }
    delete threadp;

    return 0;
}

void setStatusComplete(cl::UserEvent event, cl::Buffer buffer, cl::CommandQueue queue)
{
    std::cout << "Setting dataReceipt to CL_COMPLETE in 3 seconds" << std::endl;
    sleep(3);

    // This should trigger the execution of the WriteBuffer command
    // enqueued in the main method.
    event.setStatus(CL_COMPLETE);
}
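The suspected mechanism can also be modeled without any OpenCL runtime. The following is only a rough sketch using the C++ standard library, not code from the SDK: ModelEvent, ReleasePolicy, releaseReturns and signallerThreadStarts are hypothetical names made up for illustration, and the 200 ms timeout merely stands in for "waits forever". It shows why a release that behaves like finish() can never return when the thread that completes the event is only started after the release.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a cl::UserEvent: it becomes ready once
// setComplete() is called (this models event.setStatus(CL_COMPLETE)).
struct ModelEvent {
    std::promise<void> promise;
    std::shared_future<void> done{promise.get_future().share()};
    void setComplete() { promise.set_value(); }
};

// The two behaviors under discussion for releasing a queue handle.
enum class ReleasePolicy { Flush, Finish };

// Models the release of a queue that holds one command waiting on `pending`.
// Returns true if the release returns, false if it is still blocked when
// the timeout expires (the real call would simply never return).
bool releaseReturns(ReleasePolicy policy,
                    const std::shared_future<void>& pending,
                    std::chrono::milliseconds timeout)
{
    if (policy == ReleasePolicy::Flush) {
        return true;  // flush only submits the queued work and returns
    }
    // finish blocks until every queued command has completed
    return pending.wait_for(timeout) == std::future_status::ready;
}

// Mirrors the order of operations in main(): a command waiting on the event
// is queued, a queue handle is released, and only then is the thread that
// completes the event started. Returns true if that thread gets to run.
bool signallerThreadStarts(ReleasePolicy policy)
{
    ModelEvent dataReceipt;  // the queued write waits on this event

    // Releasing the queue copy happens before the thread launch.
    if (!releaseReturns(policy, dataReceipt.done,
                        std::chrono::milliseconds(200))) {
        return false;  // blocked: nobody is left to complete the event
    }

    // Only reached if the release returned.
    std::thread signaller([&dataReceipt] { dataReceipt.setComplete(); });
    signaller.join();
    dataReceipt.done.wait();  // the queued command could now proceed
    return true;
}
```

In this model, signallerThreadStarts(ReleasePolicy::Flush) returns true, while signallerThreadStarts(ReleasePolicy::Finish) returns false after the timeout, which corresponds to the hang observed with the AMD runtime.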

3 Replies
kentzo
Journeyman III

I can confirm this behavior, and only on the AMD platform. It looks like each clReleaseCommandQueue call triggers a clFinish, which in turn causes undefined behavior when the release happens inside a cl_event callback.

I was not able to reproduce it on either the Apple or the NVIDIA platform.


Thank you for the feedback, we are looking into this issue.


This issue has already been addressed internally, and a fix will be available in an upcoming release of the runtime.

We cherish this kind of feedback.
