rnickb
Journeyman III

passing SVM pointers in struct?

Can you pass SVM pointers to a kernel in a struct by value?

I'm trying to do this, but I get a segmentation fault when I set the argument with AMD's drivers. It works with Intel's OpenCL drivers, though.

I attached the program I'm experimenting with.

Any suggestions?

Thanks

0 Likes
6 Replies
dipak
Big Boss


  Arguments args = {A, B, C};


  errcode = clSetKernelArg(kernel, 0, sizeof(Arguments), &args);



You cannot pass a host pointer to the device this way. If "args" is SVM memory, the pointer can be passed to the kernel via clSetKernelArgSVMPointer. You may want to check this blog post: http://developer.amd.com/community/blog/2014/10/24/opencl-2-shared-virtual-memory/.

Regards,

0 Likes

Thanks for the quick response, Dipak.

I know you could pass the Arguments structure by putting it in an SVM region, but I don't want the overhead associated with this approach.

Given that the values of the SVM pointers are the same on host and device, I don't see why passing them by value like this in a struct shouldn't work. I also don't see anything in the OCL 2.0 spec that says this is disallowed.

In fact, I was able to get the code to work by changing the struct to capture the values as 64-bit integers and then casting them back to pointers (see attachment).
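The host side of that workaround can be sketched roughly as below. The attachment is not reproduced here, so the struct name `ArgumentsAsInts`, its field names, and the two helper functions are assumptions made for illustration only. The idea is simply that each SVM pointer is stored as a 64-bit integer in the struct passed to clSetKernelArg, and the kernel casts the integers back to `__global` pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side mirror of the kernel argument struct:
   each SVM pointer is carried as a 64-bit integer instead of a pointer. */
typedef struct {
    uint64_t a;
    uint64_t b;
    uint64_t c;
} ArgumentsAsInts;

/* Store three pointers as integers. Going through uintptr_t keeps the
   conversion well defined on both 32-bit and 64-bit hosts. */
static ArgumentsAsInts pack_svm_pointers(void *a, void *b, void *c)
{
    ArgumentsAsInts args;
    args.a = (uint64_t)(uintptr_t)a;
    args.b = (uint64_t)(uintptr_t)b;
    args.c = (uint64_t)(uintptr_t)c;
    return args;
}

/* Inverse conversion, shown here only to make the round trip explicit;
   in the workaround the equivalent cast happens inside the kernel. */
static void *unpack_pointer(uint64_t value)
{
    return (void *)(uintptr_t)value;
}
```

The packed struct would then be set with clSetKernelArg(kernel, 0, sizeof(ArgumentsAsInts), &args), in the same way as the original Arguments struct.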

It seems like this is a bug in AMD's drivers. Why is it treating a structure with global pointers differently than it treats a corresponding struct with integer values instead of pointers?

thanks

0 Likes

Really good observation. The 2nd kernel code seems to work fine on the CPU. However, when I ran it on the GPU, it ran successfully but gave me the wrong result. I saw the same behavior (running successfully but giving the wrong result) when I ran both kernels on the GPU. That means there was no segfault for the 1st kernel on the GPU. I did some more experiments and I've asked an expert about this. Meanwhile, can you please run both kernels for the following combinations and share your observations with us?

1) Device: CPU + build flag: -cl-std=CL1.2
2) Device: GPU + build flag: -cl-std=CL1.2
3) Device: CPU + build flag: -cl-std=CL2.0
4) Device: GPU + build flag: -cl-std=CL2.0

Regards,

0 Likes

I think I've found the cause of the wrong results on GPU devices. The SVM pointers must be made known to the kernel via the clSetKernelExecInfo API, because the OpenCL spec says:


Coarse-grain or fine-grain buffer SVM pointers used by a kernel which are not passed as a kernel arguments must be specified using clSetKernelExecInfo with CL_KERNEL_EXEC_INFO_SVM_PTRS. For example, if SVM buffer A contains a pointer to another SVM buffer B, and the kernel dereferences that pointer, then a pointer to B must either be passed as an argument in the call to that kernel or it must be made available to the kernel using clSetKernelExecInfo. For example, we might pass extra SVM pointers as follows:




clSetKernelExecInfo(kernel, CL_KERNEL_EXEC_INFO_SVM_PTRS, num_ptrs * sizeof(void *), extra_svm_ptr_list);



Here num_ptrs specifies the number of additional SVM pointers while extra_svm_ptr_list specifies a pointer to memory containing those SVM pointers.



When calling clSetKernelExecInfo with CL_KERNEL_EXEC_INFO_SVM_PTRS to specify pointers to non-argument SVM buffers as extra arguments to a kernel, each of these pointers can be the SVM pointer returned by clSVMAlloc or can be a pointer + offset into the SVM region. It is sufficient to provide one pointer for each SVM buffer used.



When I added the following lines of code before the kernel call (for both versions of the kernel), I got the correct result on GPU devices.

void *svm_ptr_list[] = { A, B, C};

clSetKernelExecInfo(kernel, CL_KERNEL_EXEC_INFO_SVM_PTRS, 3 * sizeof(void *), svm_ptr_list);

Another point: according to clinfo, the CPU is not detected as an OpenCL 2.0 device; only the GPU is. So OpenCL 2.0 features will not be supported on the CPU device.

Regards,

0 Likes
rnickb
Journeyman III

I missed that point reading the spec. Thank you for figuring this out.

Does AMD's implementation not require you to specify the pointers if they're contained in an SVM region? It seems that must be the case, because the SVMBinarySearchTree example works without calling clSetKernelExecInfo on the pointers in each nodeStruct.

0 Likes

The clSetKernelExecInfo API is only required when an SVM buffer contains a pointer to another SVM buffer (i.e. a buffer other than the containing one). It is not required when the pointer refers to an address within the same SVM buffer.
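As a rough illustration of that rule, the host could work out which buffers need to be listed before calling clSetKernelExecInfo. The sketch below is not taken from any AMD sample: the `SvmRegion` type and both helper functions are hypothetical, and it assumes the application records the base pointer and size of every clSVMAlloc allocation itself. It collects one base pointer for each SVM buffer, other than the one passed as a kernel argument, that is referenced by a set of embedded pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping record for one clSVMAlloc allocation. */
typedef struct {
    void  *base;   /* pointer returned by clSVMAlloc */
    size_t size;   /* allocation size in bytes */
} SvmRegion;

/* Returns 1 if p lies inside region r, 0 otherwise. The comparison is done
   on integers because relational operators on pointers into unrelated
   objects are not well defined in C. */
static int svm_region_contains(const SvmRegion *r, const void *p)
{
    uintptr_t lo = (uintptr_t)r->base;
    uintptr_t v  = (uintptr_t)p;
    return v >= lo && (v - lo) < r->size;
}

/* Writes into out[] the base pointer of every region, except the one at
   index arg_region (the buffer passed as a kernel argument), that at least
   one entry of ptrs[] points into. Each region is listed at most once.
   out[] must have room for num_regions entries. Returns the entry count. */
static size_t collect_extra_svm_ptrs(const SvmRegion *regions, size_t num_regions,
                                     size_t arg_region,
                                     void *const *ptrs, size_t num_ptrs,
                                     void **out)
{
    size_t count = 0;
    for (size_t r = 0; r < num_regions; ++r) {
        if (r == arg_region)
            continue;  /* pointers into the argument buffer need no listing */
        for (size_t i = 0; i < num_ptrs; ++i) {
            if (svm_region_contains(&regions[r], ptrs[i])) {
                out[count++] = regions[r].base;
                break;  /* one pointer per buffer is sufficient */
            }
        }
    }
    return count;
}
```

With `n` as the returned count, the list would then be handed over as clSetKernelExecInfo(kernel, CL_KERNEL_EXEC_INFO_SVM_PTRS, n * sizeof(void *), out). If `n` is zero, as in a tree whose nodes all live in the one buffer passed as an argument, the call can be skipped.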

Regards,

0 Likes