Kariddi
Journeyman III

Compiling the "Hello World" tutorial using CL_DEVICE_TYPE_GPU

Hi, I'm learning OpenCL. I watched the introduction videos and compiled the Hello World demo. Everything works when I run the code with the default "CL_DEVICE_TYPE_CPU" in the context initialization, but if I change it to "CL_DEVICE_TYPE_GPU", the output is not what I expect.

 

I attached the entire code of the example.

The output with CL_DEVICE_TYPE_CPU is:

 

Platform number is: 1

Hello World

 

as expected.

The output with CL_DEVICE_TYPE_GPU is:

 

Platform number is: 1

l

 

And I don't understand why ...

 

Any guess?

 

Thanks
Marcello

EDIT: I forgot to say that I'm using an ATI Radeon 5850 GPU with Catalyst 10.3b drivers and ATI Stream 2.01 SDK

// Host code. hw and checkErr() are defined earlier in the file (not shown).
int main()
{
    cl_int err;

    // Pick the first available platform.
    cl::vector< cl::Platform > platformList;
    cl::Platform::get(&platformList);
    checkErr(platformList.size() != 0 ? CL_SUCCESS : -1, "cl::Platform::get");
    std::cerr << "Platform number is: " << platformList.size() << std::endl;

    cl::string platformVendor;
    platformList[0].getInfo(CL_PLATFORM_VENDOR, &platformVendor);
    //std::cerr << "Platform is by: " << platformVendor << "\n";

    // Create a context for the GPU device(s) of that platform.
    cl_context_properties cprops[3] =
        {CL_CONTEXT_PLATFORM, (cl_context_properties)(platformList[0])(), 0};
    cl::Context context(CL_DEVICE_TYPE_GPU, cprops, NULL, NULL, &err);
    checkErr(err, "Context::Context()");

    // Output buffer backed by host memory.
    char * outH = new char[hw.length()+1];
    cl::Buffer outCL(
        context,
        CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR,
        hw.length()+1,
        outH,
        &err);
    checkErr(err, "Buffer::Buffer()");

    cl::vector<cl::Device> devices;
    devices = context.getInfo<CL_CONTEXT_DEVICES>();
    checkErr(devices.size() > 0 ? CL_SUCCESS : -1, "devices.size() > 0");

    // Load and build the kernel source.
    std::ifstream file("lesson1_kernels.cl");
    checkErr(file.is_open() ? CL_SUCCESS : -1, "lesson1_kernels.cl");
    std::string prog(
        std::istreambuf_iterator<char>(file),
        (std::istreambuf_iterator<char>()));
    cl::Program::Sources source(
        1, std::make_pair(prog.c_str(), prog.length()+1));
    cl::Program program(context, source);
    err = program.build(devices, "");
    checkErr(err, "Program::build()");

    cl::Kernel kernel(program, "hello", &err);
    checkErr(err, "Kernel::Kernel()");
    err = kernel.setArg(0, outCL);
    checkErr(err, "Kernel::setArg()");

    // Run one work-item per character, then read the result back.
    cl::CommandQueue queue(context, devices[0], 0, &err);
    checkErr(err, "CommandQueue::CommandQueue()");
    cl::Event event;
    err = queue.enqueueNDRangeKernel(
        kernel,
        cl::NullRange,
        cl::NDRange(hw.length()+1),
        cl::NDRange(1, 1),
        NULL,
        &event);
    checkErr(err, "CommandQueue::enqueueNDRangeKernel()");
    event.wait();
    err = queue.enqueueReadBuffer(outCL, CL_TRUE, 0, hw.length()+1, outH);
    checkErr(err, "CommandQueue::enqueueReadBuffer()");

    std::cout << outH;
    return EXIT_SUCCESS;
}

// lesson1_kernels.cl
#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable
__constant char hw[] = "Hello World\n";
__kernel void hello(__global char * out)
{
    size_t tid = get_global_id(0);
    out[tid] = hw[tid];
}

2 Replies
omkaranathan
Adept I

Marcello, 

The HelloWorld tutorial uses the cl_khr_byte_addressable_store extension, which is not currently supported on GPUs.

It will be supported in an upcoming release.
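
For readers who want to detect this case at run time, here is a minimal sketch. It assumes the device's extension list has already been fetched as a space-separated string (for example with the C++ wrapper's device.getInfo<CL_DEVICE_EXTENSIONS>()). The helper name has_extension is hypothetical and not part of the tutorial or the OpenCL API.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` appears as a whole token in
// the space-separated extension list reported by an OpenCL device.
// `extensions` is assumed to come from a CL_DEVICE_EXTENSIONS query.
bool has_extension(const std::string& extensions, const std::string& name)
{
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        // Compare whole tokens so that a longer extension name sharing the
        // same prefix does not produce a false match.
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

If has_extension(extensions, "cl_khr_byte_addressable_store") returns false for the chosen device, a kernel that writes single chars to global memory, like the tutorial's, should not be expected to work there.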


Ok, thank you for the fast answer

 

I managed to get the same result by using an int array in the kernel instead of a char array, copying the string 4 bytes at a time.
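
The poster's actual code is not shown, so the following is only a host-side sketch of the idea, under the assumption that the string is packed into 32-bit words and each work-item stores one whole word instead of one byte. The helper names (pack_words, copy_words, unpack_words) are hypothetical. The loop in copy_words stands in for the kernel, with each iteration playing the role of one work-item.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Pack a string, including its terminating NUL, into 32-bit words.
// The last word is zero-padded if the byte count is not a multiple of 4.
std::vector<std::uint32_t> pack_words(const std::string& text)
{
    const std::size_t bytes = text.size() + 1;             // include the NUL
    std::vector<std::uint32_t> words((bytes + 3) / 4, 0);  // round up to whole words
    assert(!words.empty());
    std::memcpy(words.data(), text.c_str(), bytes);
    return words;
}

// Stand-in for the kernel body: out[tid] = hw[tid], but with 32-bit
// elements, so every store writes a full word and no byte-addressable
// store is needed.
std::vector<std::uint32_t> copy_words(const std::vector<std::uint32_t>& src)
{
    std::vector<std::uint32_t> out(src.size());
    for (std::size_t tid = 0; tid < src.size(); ++tid) {
        out[tid] = src[tid];
    }
    return out;
}

// Reinterpret the words as characters again; the result ends at the first NUL.
std::string unpack_words(const std::vector<std::uint32_t>& words)
{
    std::vector<char> bytes(words.size() * 4 + 1, '\0');
    if (!words.empty()) {
        std::memcpy(bytes.data(), words.data(), words.size() * 4);
    }
    return std::string(bytes.data());
}
```

With this layout, "Hello World\n" plus its NUL is 13 bytes, so it occupies 4 words. The global work size would then be the word count rather than hw.length()+1, and the output buffer would need to be rounded up to a multiple of 4 bytes.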
