Hi, I'm having a problem running some simple OpenCL code on my AMD GPU. It runs fine on my Intel i5.
The problem seems to occur just before I create the kernel, after the buffers have been created successfully.
I get the runtime error: Unhandled exception at 0x0f935187 in 3D Camera.exe: 0xC0000005: Access violation reading location 0x00000000.
Can anyone help? It seems there are certain rules I should follow on AMD GPUs that don't apply to CPUs.
Also, I'm using C++. My code is shown below:
using namespace cv;
using namespace std;
int main() {
const std::string hw("Hello World\n");
cl_int err;
//get all platforms (drivers)
//CHOOSING PLATFORM ON COMPUTER
std::vector<cl::Platform> all_platforms;
cl::Platform::get(&all_platforms);
if(all_platforms.size()==0){
std::cout<<" No platforms found. Check OpenCL installation!\n";
exit(1);
}
int platform;
cout<<"Available Platforms: "<<endl;
for (int i=0;i<all_platforms.size();i++) {
cout<<i<<". "<<all_platforms[i].getInfo<CL_PLATFORM_NAME>()<<"\n";
}
cout<<endl<<"Choose a Platform number: ";
cin>>platform;
cl::Platform default_platform=all_platforms[platform]; //Choosing a platform
cout<<endl<<"Using: "<<default_platform.getInfo<CL_PLATFORM_NAME>()<<endl;
//CHOOSING DEVICE ON PLATFORM
std::vector<cl::Device> all_devices;
default_platform.getDevices(CL_DEVICE_TYPE_ALL, &all_devices);
if(all_devices.size()==0){
std::cout<<" No devices found. Check OpenCL installation!\n";
exit(1);
}
cout<<endl<<"Available devices on chosen Platform: "<<endl;
for (int i=0;i<all_devices.size();i++) {
cout<<i<<". "<<all_devices[i].getInfo<CL_DEVICE_NAME>()<<"\n";
}
int device;
cout<<endl<<"Choose a device number: ";
cin>>device;
cl::Device default_device=all_devices[device];
std::cout<< "Using device: "<<default_device.getInfo<CL_DEVICE_NAME>()<<"\n";
std::string kernel_code=
" void kernel simple_add(global const int* A, global const int* B, global int* C){ C[get_global_id(0)]=A[get_global_id(0)]+B[get_global_id(0)];} ";
cl::Context context(default_device);
cl::Program::Sources sources(1,std::make_pair(kernel_code.c_str(), kernel_code.length())); // +1
cl::Program program(context,sources);
err = program.build(all_devices);
cout<<"Program built. "<<endl;
// create buffers on the device
cl::Buffer buffer_A(context,CL_MEM_READ_WRITE,sizeof(int)*10);
cl::Buffer buffer_B(context,CL_MEM_READ_WRITE,sizeof(int)*10);
cl::Buffer buffer_C(context,CL_MEM_READ_WRITE,sizeof(int)*10);
int A[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
int B[] = {0, 1, 2, 0, 1, 2, 0, 1, 2, 0};
//create queue to which we will push commands for the device.
cl::CommandQueue queue(context,default_device);
//write arrays A and B to the device
queue.enqueueWriteBuffer(buffer_A,CL_TRUE,0,sizeof(int)*10,A);
queue.enqueueWriteBuffer(buffer_B,CL_TRUE,0,sizeof(int)*10,B);
cout<<"Buffers enqueued. "<<endl;
//alternative way to run the kernel
cl::Kernel kernel_add = cl::Kernel(program,"simple_add");
cout<<"Kernel created. "<<endl;
kernel_add.setArg(0,buffer_A);
kernel_add.setArg(1,buffer_B);
kernel_add.setArg(2,buffer_C);
cout<<"Args set. "<<endl;
queue.enqueueNDRangeKernel(kernel_add,cl::NullRange,cl::NDRange(10),cl::NullRange);
queue.finish();
int C[10];
//read result C from the device to array C
queue.enqueueReadBuffer(buffer_C,CL_TRUE,0,sizeof(int)*10,C);
std::cout<<"result: "<<endl;
for(int i=0;i<10;i++){
std::cout<<C[i]<<" ";
}
return 0;
}
Check for all potential OpenCL errors. And on which line does it crash?
Line 77, just when the kernel is created, it crashes and gives the error.
The problem is that you are trying to build the program for devices which are not included in the context. You should get a CL_INVALID_DEVICE error from the program.build() call. The crash later on happens because you didn't catch this error first.
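To illustrate the "catch this error first" advice, here is a rough sketch of a status check that stops at the first failing call instead of crashing later. The `sketch` namespace, `error_name` and `check` are hypothetical helpers, not part of the OpenCL wrapper, and the numeric constants are assumed to mirror the macros in CL/cl.h so the snippet compiles on its own; in real code you would include the OpenCL header and use its macros.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {

// Assumption: these values mirror the corresponding macros in CL/cl.h.
constexpr int SUCCESS = 0;                      // CL_SUCCESS
constexpr int BUILD_PROGRAM_FAILURE = -11;      // CL_BUILD_PROGRAM_FAILURE
constexpr int INVALID_DEVICE = -33;             // CL_INVALID_DEVICE
constexpr int INVALID_PROGRAM_EXECUTABLE = -45; // CL_INVALID_PROGRAM_EXECUTABLE
constexpr int INVALID_KERNEL_NAME = -46;        // CL_INVALID_KERNEL_NAME

// Hypothetical helper: maps a few status codes to readable names.
inline std::string error_name(int err) {
    switch (err) {
        case SUCCESS:                    return "CL_SUCCESS";
        case BUILD_PROGRAM_FAILURE:      return "CL_BUILD_PROGRAM_FAILURE";
        case INVALID_DEVICE:             return "CL_INVALID_DEVICE";
        case INVALID_PROGRAM_EXECUTABLE: return "CL_INVALID_PROGRAM_EXECUTABLE";
        case INVALID_KERNEL_NAME:        return "CL_INVALID_KERNEL_NAME";
        default: {
            std::ostringstream os;
            os << "OpenCL error " << err;
            return os.str();
        }
    }
}

// Hypothetical helper: throws as soon as a call reports a failure,
// so a failed build is reported before any kernel is created.
inline void check(int err, const char* what) {
    assert(what != nullptr);
    if (err != SUCCESS) {
        throw std::runtime_error(std::string(what) + " failed: " + error_name(err));
    }
}

}  // namespace sketch

// Intended use with the code from the question (not compiled here):
//   sketch::check(program.build(all_devices), "program.build");
//   cl_int err = CL_SUCCESS;
//   cl::Kernel kernel_add(program, "simple_add", &err);
//   sketch::check(err, "cl::Kernel");
```

With a check like this around program.build(), the run should stop with a readable message at the build step rather than with an access violation further down.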
Ah, thanks! I figured out that the AMD platform supports Intel devices, while the Intel platform doesn't support AMD devices. So using the AMD platform gave me, as you said, a device not included in the context. To solve that, I changed line 32 to:
default_platform.getDevices(CL_DEVICE_TYPE_GPU, &all_devices);
This makes sure only my AMD GPU is included and can be built.
The short answer is: whatever devices are used in the context creation should be the ones used in the program build.
You can just call program.build() and it will build the program for all devices which are in the context.
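If you do want to pass an explicit device list, the rule from this thread (every device given to the build must also be in the context) can be checked up front. The sketch below is only an illustration: `all_in_context` is a hypothetical helper, and `DeviceId` stands in for whatever handle you compare, for example the cl_device_id you get from calling `device()` on a cl::Device.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns true when every device requested for the
// build is also part of the context. DeviceId is a stand-in for a device
// handle such as cl_device_id (an assumption made for this sketch).
template <typename DeviceId>
bool all_in_context(const std::vector<DeviceId>& build_devices,
                    const std::vector<DeviceId>& context_devices) {
    return std::all_of(
        build_devices.begin(), build_devices.end(),
        [&](const DeviceId& d) {
            // A build device is acceptable only if the context contains it.
            return std::find(context_devices.begin(),
                             context_devices.end(), d) != context_devices.end();
        });
}
```

In the code from the question, the context holds only the chosen device while the build receives every device on the platform, so a check like this would fail there. A simpler way to stay consistent is to ask the context for its own devices with context.getInfo<CL_CONTEXT_DEVICES>() and pass that vector to the build.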