2 Replies Latest reply on Dec 5, 2011 2:04 PM by homemadejam

    clEnqueueCopyBuffer Segfault

      Unexplained segfault. Runs fine on Intel SDK and Nvidia

      I've been running a project that is fine under the Intel OpenCL SDK and the Nvidia SDK. This is running on an Intel Q9500.

      The line shown in the attached code below is segfaulting.

      I'm not sure why this is happening: I have an almost identical line above it for h_A and d_A, which works fine. Both pairs of buffers are allocated in the same way, and none of the arguments (including these) are null on entry to the function. Reducing the number of bytes to enqueue made no difference. I also noticed that a few of the samples segfault too.

      I am running Kubuntu 11.10 with the 2.6.38-12-generic kernel and version 2.5 of the APP SDK.

      Any ideas or workarounds?



      ciErrNum = clEnqueueCopyBuffer(m_commandQueue, h_B, d_B, 0, 0,
                                     sizeof(char)*40*40, 0, NULL,
                                     &clEnqCopyBuffer1.CLEvent());

      Program received signal SIGSEGV, Segmentation fault.
      0x00007ffff144d7f9 in clEnqueueCopyBuffer () from /opt/AMDAPP/lib/x86_64/libamdocl64.so

        • clEnqueueCopyBuffer Segfault

          My general rule of thumb is that if it's crashing, it's because you haven't looked hard enough for your own bugs yet ...

          On the other hand, I find the CPU and Linux AMD GPU drivers crash more often than the Windows AMD GPU driver; I haven't ruled out bugs in my code, but I don't have any simple examples which are easy to validate.

          Which SDK examples are crashing?  What device are you using?



            • clEnqueueCopyBuffer Segfault

              Having looked more closely, it's the GL examples that aren't running correctly: all the others work fine. I don't think GL works properly on my machine so I think this can safely be ignored.
              I'll try to paste the relevant extracts below; hopefully you can shed some light. It segfaults right on the last line.



              // Allocate a 40 by 40 matrix x2
              char* a_data = (char*)malloc(sizeof(char)*40*40);
              char* b_data = (char*)malloc(sizeof(char)*40*40);

              // Create the dest mem
              char *c_data = (char*)malloc(sizeof(char)*40*40);

              cl_mem h_A = clCreateBuffer(m_ctx, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                                          sizeof(char)*40*40, a_data, &ciErrNum);
              cl_mem h_B = clCreateBuffer(m_ctx, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                                          sizeof(char)*40*40, b_data, &ciErrNum);

              cpProgram = clCreateProgramWithSource(m_ctx, 1, (const char **)&src,
                                                    &kernel_length, &ciErrNum);
              ciErrNum = clBuildProgram(cpProgram, 0, NULL, NULL, NULL, NULL);

              // Create the kernel
              m_kernel = clCreateKernel(cpProgram, "matrixMul", &ciErrNum);

              // Build the device mem
              cl_mem d_A;
              cl_mem d_B;
              cl_mem d_C;
              d_A = clCreateBuffer(m_ctx, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                                   sizeof(char)*40*40, h_A, &ciErrNum);
              d_B = clCreateBuffer(m_ctx, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                                   sizeof(char)*40*40, h_B, &ciErrNum);
              d_C = clCreateBuffer(m_ctx, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
                                   sizeof(char)*40*40, c_data, &ciErrNum);

              ciErrNum = clEnqueueCopyBuffer(m_commandQueue, h_A, d_A, 0, 0,
                                             sizeof(char)*40*40, 0, NULL,
                                             &clEnqCopyBuffer.CLEvent());
              ciErrNum = clEnqueueCopyBuffer(m_commandQueue, h_B, d_B, 0, 0,
                                             sizeof(char)*40*40, 0, NULL,
                                             &clEnqCopyBuffer1.CLEvent());
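
              Every call in the extract above stores its status in ciErrNum, but the extract never inspects it, so a failure in an earlier call could go unnoticed until something later crashes. A minimal sketch of a status check follows. The names cl_status_name and cl_check are hypothetical helpers, not part of OpenCL, and the status values are a small subset of the OpenCL 1.x codes, copied locally only so the sketch builds without the OpenCL headers.

```c
#include <stdio.h>

/* Subset of OpenCL 1.x status codes, defined locally (an assumption made
   for this sketch) so it compiles without <CL/cl.h>. If the real header
   is included first, these definitions are skipped. */
#ifndef CL_SUCCESS
#define CL_SUCCESS                         0
#define CL_MEM_OBJECT_ALLOCATION_FAILURE  -4
#define CL_OUT_OF_RESOURCES               -5
#define CL_OUT_OF_HOST_MEMORY             -6
#define CL_MEM_COPY_OVERLAP               -8
#define CL_INVALID_VALUE                 -30
#define CL_INVALID_CONTEXT               -34
#define CL_INVALID_COMMAND_QUEUE         -36
#define CL_INVALID_HOST_PTR              -37
#define CL_INVALID_MEM_OBJECT            -38
#define CL_INVALID_BUFFER_SIZE           -61
#endif

/* Hypothetical helper: map a status code to a readable name. */
const char *cl_status_name(int status)
{
    switch (status) {
    case CL_SUCCESS:                       return "CL_SUCCESS";
    case CL_MEM_OBJECT_ALLOCATION_FAILURE: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case CL_OUT_OF_RESOURCES:              return "CL_OUT_OF_RESOURCES";
    case CL_OUT_OF_HOST_MEMORY:            return "CL_OUT_OF_HOST_MEMORY";
    case CL_MEM_COPY_OVERLAP:              return "CL_MEM_COPY_OVERLAP";
    case CL_INVALID_VALUE:                 return "CL_INVALID_VALUE";
    case CL_INVALID_CONTEXT:               return "CL_INVALID_CONTEXT";
    case CL_INVALID_COMMAND_QUEUE:         return "CL_INVALID_COMMAND_QUEUE";
    case CL_INVALID_HOST_PTR:              return "CL_INVALID_HOST_PTR";
    case CL_INVALID_MEM_OBJECT:            return "CL_INVALID_MEM_OBJECT";
    case CL_INVALID_BUFFER_SIZE:           return "CL_INVALID_BUFFER_SIZE";
    default:                               return "UNKNOWN_CL_STATUS";
    }
}

/* Hypothetical helper: returns 1 on success; otherwise reports which
   call failed (the caller supplies the label) and returns 0. */
int cl_check(int status, const char *what)
{
    if (status == CL_SUCCESS)
        return 1;
    fprintf(stderr, "%s failed: %s (%d)\n", what, cl_status_name(status), status);
    return 0;
}
```

              With a helper like this, each call could be followed by something like `if (!cl_check(ciErrNum, "clCreateBuffer(d_B)")) return;`. A status check alone would likely not flag one thing in the extract: d_A and d_B are created with CL_MEM_USE_HOST_PTR but are given h_A and h_B, which are cl_mem handles rather than host allocations. Whether that explains the crash on this SDK is not established in the thread.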