I'm developing an application that uses two-dimensional matrices stored as flat vectors. That part is not a problem; the application worked correctly.
I've had trouble since I tried to parallelize the part of the application which copies rows and columns of a matrix. I created functions such as
void copyrow(cl_command_queue queue, cl_kernel kernel, cl_mem mat1, int row1, cl_mem mat2, int row2, int size1, int size2)
but I get a segmentation fault. If the problem is not caused by the kernel this function uses, what else could it be? Is there something I have to pay attention to?
I also noticed that when I pass an argument of the wrong type, the error message shows the signature with pointer types, such as void copyrow(cl_command_queue* queue ...
but the signature is the one I wrote above. Are there hidden pointers?
Can you post your definition of copyrow function?
Are you using clEnqueueWriteBufferRect or creating sub-buffers using clCreateSubBuffer?
Since he has a kernel parameter, I assume he uses a custom copy kernel.
I also tried using pointer-type parameters instead.
void copyrow(cl_command_queue queue, cl_kernel kernel, cl_mem mat1, int row1, cl_mem mat2, int row2, int size1, int size2)
{
    size_t global_work_size[2];
    global_work_size[0] = size1;
    global_work_size[1] = size2;
    clSetKernelArg(kernel, 0, sizeof(mat1), (void*) &mat1);
    clSetKernelArg(kernel, 1, sizeof(row1), (void*) &row1);
    clSetKernelArg(kernel, 2, sizeof(mat2), (void*) &mat2);
    clSetKernelArg(kernel, 3, sizeof(row2), (void*) &row2);
    clSetKernelArg(kernel, 4, sizeof(size2), (void*) &size2);
    clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global_work_size, NULL, 0, NULL, NULL);
    clFinish(queue);
}

//function call
copyrow(queue, copyrowkern, Z_old_buf, 0, Z_old_buf, 1, size1, width);

//kernel function
__kernel void copyrow(__global double* mat1, const int row1, __global double* mat2, const int row2, const int size2)
{
    int i = get_global_id(0);
    int j = get_global_id(1);
    if(i == row1)
        mat1[row1*size2+j] = mat2[row2*size2+j];
}
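To check results like these on the host, it may help to have a plain C emulation of what the posted kernel does over its 2D range. The sketch below is only an illustration: copyrow_emulated, global0 and global1 are hypothetical names, not part of the posted code or of OpenCL. It runs the kernel body once per (i, j) pair and counts how many work-items actually copy something. Note that row row2 of mat2 is copied into row row1 of mat1, so the first buffer argument is the destination.

```c
#include <stddef.h>

/* Hypothetical CPU emulation of the posted copyrow kernel.
 * global0 and global1 play the role of global_work_size[0] and [1].
 * Returns the number of (i, j) work-items that performed a copy. */
int copyrow_emulated(double *mat1, int row1, const double *mat2, int row2,
                     int size2, size_t global0, size_t global1)
{
    int copied = 0;
    for (size_t i = 0; i < global0; ++i) {        /* get_global_id(0) */
        for (size_t j = 0; j < global1; ++j) {    /* get_global_id(1) */
            if ((int)i == row1) {
                /* Same indexing as the kernel: row-major, size2 columns. */
                mat1[(size_t)row1 * size2 + j] = mat2[(size_t)row2 * size2 + j];
                ++copied;
            }
        }
    }
    return copied;
}
```

For a 3x4 matrix, calling copyrow_emulated(m, 0, m, 1, 4, 3, 4) overwrites row 0 with row 1 and reports 4 copies. If row1 is not smaller than global0, no work-item matches the "if" and nothing is copied, which is one thing to rule out when the device results look wrong.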
Maybe I'm wrong, but can you really use "sizeof(mat1)" in clSetKernelArg? Does that return the size of the data in mat1 or the size of the whole mat1 mem object? Maybe that's what causes the segmentation faults.
In fact, cl_mem and the other cl_* objects are just pointers, and the way the kernel arguments are set seems correct. Maybe just wait for 2.3; it could be a bug in the SDK.
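The point about cl_* objects being pointers can be illustrated without OpenCL at all. In the standard headers cl_mem is declared as typedef struct _cl_mem * cl_mem; so sizeof on it gives the size of a handle, not the size of the buffer behind it. The sketch below uses a hypothetical stand-in type (fake_mem is an invented name, not an OpenCL type) to show the same thing.

```c
#include <stddef.h>

/* Hypothetical stand-in for an opaque handle such as cl_mem:
 * a pointer to a struct whose contents the caller never sees. */
typedef struct fake_mem_object *fake_mem;

/* Size of the handle itself. It does not depend on how much data
 * the object behind the handle holds. */
size_t handle_size(void)
{
    return sizeof(fake_mem);
}
```

handle_size() returns the size of a struct pointer on the platform (typically 4 or 8 bytes), whatever the buffer size. This is also why a compiler diagnostic can show pointer types for a signature that was written with cl_* names only.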
Replying to eklund.n's question about "sizeof(mat1)" in clSetKernelArg: I used this approach for the working part of the application, too.
Now the segmentation fault is gone, but I get unexpected results. Since the copy kernels are very simple, I believe there is something I have to pay attention to when defining functions like the one I posted.
As I wrote, I solved the segmentation fault: not all functions were implemented like the one I posted before, and I had forgotten some "&" characters.
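A missing "&" matters because a call like clSetKernelArg receives an address plus a byte count and reads that many bytes from the address. The sketch below imitates that contract with a hypothetical mock (fake_set_arg, fake_arg_slot and fake_slot_as_int are invented for illustration and are not OpenCL functions); it is an assumption about the general pattern, not the runtime's real implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical storage for one kernel argument. */
typedef struct {
    unsigned char bytes[16];
    size_t size;
} fake_arg_slot;

/* Mock of the (arg_size, arg_value) contract: copy arg_size bytes
 * starting at the address arg_value. Returns 0 on success, -1 on error. */
int fake_set_arg(fake_arg_slot *slot, size_t arg_size, const void *arg_value)
{
    if (arg_value == NULL || arg_size > sizeof slot->bytes)
        return -1;
    memcpy(slot->bytes, arg_value, arg_size);
    slot->size = arg_size;
    return 0;
}

/* Read the stored bytes back as an int, for checking. */
int fake_slot_as_int(const fake_arg_slot *slot)
{
    int v = 0;
    memcpy(&v, slot->bytes, sizeof v);
    return v;
}
```

With int row1 = 3; the correct call is fake_set_arg(&slot, sizeof(row1), &row1). Writing (void*) row1 instead would hand over the value 3 as if it were an address, and the copy would then read from an invalid location, which is consistent with a segmentation fault.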
I fixed all the unexpected results by adding mem_fence(CLK_GLOBAL_MEM_FENCE); at the end of each kernel function, such as the "copyrow" kernel I posted before. I added it only as an experiment, though, and I don't know why the program now works correctly. Was this instruction needed because of the "if" in the kernels, or because the work-items are launched from functions instead of from the main code?
I think it is because of the "if", but I thought that clFinish(queue) would wait for all work-items regardless of their execution path.