trinitrotoluene
Journeyman III

OpenCL samples from AMD APP SDK v2.3 hang the system on Ubuntu (SOLVED)

 

The FluidSimulation2D and Mandelbrot OpenCL samples (under the x86_64 folder) hang the computer after 1 to 5 seconds of execution on the GPU. This is always reproducible. Execution on the CPU works fine. Both samples work fine on the GPU again when I reinstall the previous AMD APP SDK v2.2, so I suspect a bug in version 2.3 of the SDK. I have not tested the other samples.

Since the FluidSimulation2D sample is only available with version 2.3 of the SDK, I saved the binary in another folder so that I can run it with version 2.2.

 

Here is the system configuration:

 

Linux AMD APP SDK v2.3 and v2.2 64 bits. 

 

Motherboard: Asus crosshair IV formula.

Processor: AMD Phenom II X6 1090T.

GPU: ATI Radeon HD5870 (Asus EAH5870/2DIS/1GD5/V2) with Catalyst 11.2 driver.  

 

 

Operating system: Ubuntu 10.10 64 bits. 

 

Kernel: Linux version 2.6.35-25-generic (buildd@crested) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #44-Ubuntu SMP Fri Jan 21 17:40:44 UTC 2011

 

 

fglrxinfo output:

 

display: :0.0  screen: 0

OpenGL vendor string: ATI Technologies Inc.

OpenGL renderer string: ATI Radeon HD 5800 Series

OpenGL version string: 4.1.10524 Compatibility Profile Context

 

 

I can provide more information on request.

 

0 Likes
19 Replies
doodle
Journeyman III

Hey, I also had some problems with 2.3 that I was trying to sort through. I solved my issues by replacing the GL header files with my system's header files. That seemed to solve things for me.

It might not help you per se, but if I remember correctly, the examples you had issues with are OpenGL examples.

If you have any luck or more problems, let me know. I will be around for some time, so we can figure this out together.

I am also on 64-bit Ubuntu.

0 Likes

I have found a counterexample. The SimpleGL sample program works fine on the GPU with version 2.3 of the SDK.

I tried your suggestion, but unfortunately the hang still happens, and only when the CL kernel executes on the GPU. As you suggest, it seems that only some of the samples that use CL on the GPU and OpenGL at the same time are problematic.

So, since the SimpleGL sample works but the Mandelbrot and FluidSimulation2D samples do not work on the GPU, I will try to find out which OpenCL features do not work with version 2.3 of the SDK.

 

0 Likes

For now, I am trying to debug the FluidSimulation2D sample. The program hangs after 1 second while executing clEnqueueWriteBuffer and clEnqueueReadBuffer inside the FluidSimulation2D::runCLKernels() function. So I will write a simple program that makes heavy use of the clEnqueue* functions to see what happens.

Basically, the simple program will take a 2D array initialized with random data as input and perform simple arithmetic on it. It will then use the output as the input of the next execution of the kernel.
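The plan above can be modelled on the host alone, which gives a reference result to compare against whatever the GPU returns. This is only a sketch: the function name runReference and its signature are hypothetical, and the "+1" per pass is an assumed stand-in for the "simple arithmetic" mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Host-only model of the planned test (hypothetical helper, not part of the SDK).
// Each pass adds 1 to every cell; the output of one pass becomes the input of
// the next, exactly as the GPU test is meant to chain kernel executions.
std::vector<int> runReference(std::vector<int> in,
                              std::size_t width,
                              std::size_t height,
                              int passes)
{
    assert(in.size() == width * height);
    assert(passes >= 0);

    std::vector<int> out(in.size(), 0);
    for (int p = 0; p < passes; ++p) {
        for (std::size_t j = 0; j < height; ++j) {
            for (std::size_t i = 0; i < width; ++i) {
                out[i + j * width] = in[i + j * width] + 1;
            }
        }
        // Feed this pass's output into the next pass.
        std::swap(in, out);
    }
    // After the final swap, "in" holds the most recent output.
    return in;
}
```

Comparing the array read back from the device with the vector returned here would show whether a run that completes also produced correct data.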

 

0 Likes

A display hang while the GPU executes some OpenCL commands is expected. The system does not actually hang; only the display freezes, because the GPU cannot refresh the display while it is computing.

0 Likes

Originally posted by: himanshu.gautam A display hang while the GPU executes some OpenCL commands is expected. The system does not actually hang; only the display freezes, because the GPU cannot refresh the display while it is computing.

Yes, but what I am trying to find out is why the two samples (FluidSimulation2D and Mandelbrot) work fine on both the GPU and the CPU with version 2.2 of the SDK, but hang the display, on the GPU only, once I install version 2.3.

In short: SDK 2.3 with the GPU gives a hung display; SDK 2.2 with the GPU runs the samples well, with the display updating in real time. There is a problem somewhere and I am trying to find it.

 

 

 

 

 

0 Likes

For me, both examples (FluidSimulation2D and Mandelbrot OpenCL) run stably, even simultaneously.

Operating system: Ubuntu 10.10 64 bits.
AMD APP SDK v2.3 64 bits.

GPU: ATI Radeon HD6950 with Catalyst 11.1 driver.

0 Likes

Originally posted by: ED1980 For me, both examples (FluidSimulation2D and Mandelbrot OpenCL) run stably, even simultaneously. Operating system: Ubuntu 10.10 64 bits. AMD APP SDK v2.3 64 bits.

 

GPU: ATI Radeon HD6950 with Catalyst 11.1 driver.

 

Your GPU is different and your graphics driver version is different. Is it possible that the problem affects only HD5800-class hardware? I will try reinstalling both the OpenCL SDK and the graphics driver to see what happens.

But now I know that the 2.3 SDK works with the newest GPUs. Thanks.

 

 

0 Likes

I do not see any display hang with my HD 5770 card using our internal SDK on my Vista machine. Do you use any particular command-line options? I will check on Ubuntu, but as ED1980 reported, it should not be a problem.

0 Likes

Originally posted by: himanshu.gautam I do not see any display hang with my HD 5770 card using our internal SDK on my Vista machine. Do you use any particular command-line options? I will check on Ubuntu, but as ED1980 reported, it should not be a problem.

The only command-line option I use is the one for device selection: --device cpu or gpu. I carefully followed the instructions for removing the old SDK and installing the new one. But since everything works well with version 2.2, maybe some old files are still around. If you check with Ubuntu 10.10, can you tell me which Catalyst driver version you are using? Thanks.

 

0 Likes

I have written a simple program that reproduces the hang on the GPU with version 2.3 of the SDK. The same program runs fine on the GPU with version 2.2 of the SDK, at 23% GPU usage (aticonfig --odgc). This time, the tests were done with Catalyst 10.12.

If the 2D array is small (8x8), the program does not hang the computer. With a 256x256 array, the hang occurs.

 

 

 

 

//C++ source code
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <cstring>   // memcpy
#include <string>
#include <vector>
#include <CL/cl.hpp>

using std::cout;
using std::endl;

//CONSTANT DEFINITIONS
const cl_device_type DEVICE_TYPE = CL_DEVICE_TYPE_GPU;
const int WIDTH = 256;
const int HEIGHT = 256;
const int NUM_ITERATIONS = 100000;

void readStringInFile(const char *filename,std::string &source);

int main(int argc,char **argv)
{
    int i;
    std::vector<cl::Platform> platform;
    std::vector<cl::Device> device;
    cl_int err;

    // Use the first platform and all of its devices of the requested type.
    cl::Platform::get(&platform);
    platform[0].getDevices(DEVICE_TYPE,&device);

    cl_context_properties context_properties[] =
        {CL_CONTEXT_PLATFORM, (cl_context_properties)(platform[0])(), 0};
    cl::Context context(device,context_properties,NULL,NULL,NULL);

    // Load and build the kernel source.
    std::string source_str;
    readStringInFile("kernel.cl",source_str);
    cl::Program::Sources source(1, std::make_pair(source_str.c_str(), source_str.length()));
    cl::Program program = cl::Program(context,source);
    program.build(device);

    std::string build_info("");
    program.getBuildInfo(device[0],CL_PROGRAM_BUILD_LOG,&build_info);
    if(build_info.length() > 0)
    {
        cout<<"Build log: "<<build_info<<endl;
    }

    cl::Kernel kernel(program,"add_array2D",&err);
    if(err != CL_SUCCESS)
    {
        cout<<"Kernel Error"<<endl;
    }

    cl::Event event;
    cl::CommandQueue queue(context,device[0],0,&err);

    cl_int *in_data = new cl_int[WIDTH*HEIGHT];
    cl_int *out_data = new cl_int[WIDTH*HEIGHT];

    cl::Buffer input(context, CL_MEM_USE_HOST_PTR|CL_MEM_READ_WRITE,
                     WIDTH*HEIGHT*sizeof(cl_int), in_data, &err);
    cl::Buffer output(context, CL_MEM_USE_HOST_PTR|CL_MEM_READ_WRITE,
                      WIDTH*HEIGHT*sizeof(cl_int), out_data, &err);

    for(i = 0; i < WIDTH*HEIGHT ; i++)
    {
        in_data[i] = 0;
    }

    kernel.setArg(0,input);
    kernel.setArg(1,output);
    kernel.setArg(2,WIDTH);
    kernel.setArg(3,HEIGHT);

    // Each iteration: blocking write, kernel launch, blocking read,
    // then feed the output back in as the next input.
    for(i = 0 ; i < NUM_ITERATIONS ; i++)
    {
        err = queue.enqueueWriteBuffer(input, CL_TRUE, 0, WIDTH*HEIGHT*sizeof(cl_int), in_data);
        err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(WIDTH,HEIGHT),
                                         cl::NullRange, NULL, &event);
        if(err != CL_SUCCESS)
        {
            cout<<"NDRange kernel error"<<endl;
        }
        event.wait();
        queue.enqueueReadBuffer(output,CL_TRUE,0,WIDTH*HEIGHT*sizeof(cl_int),out_data);
        memcpy(in_data,out_data,WIDTH*HEIGHT*sizeof(cl_int));
    }

    for(i = 0 ; i < WIDTH*HEIGHT ; i++)
    {
        cout<<out_data[i]<<endl;
    }

    delete[] in_data;
    delete[] out_data;
    return (EXIT_SUCCESS);
}

// Read the whole file into "source"; leaves it untouched if the file cannot be opened.
void readStringInFile(const char *filename,std::string &source)
{
    std::ifstream read_file;
    int length;
    read_file.open(filename,std::ifstream::in);
    if(!read_file.fail())
    {
        read_file.seekg(0,std::ios::end);
        length = read_file.tellg();
        read_file.seekg(0,std::ios::beg);
        source.resize(length);
        read_file.read(&source[0],length);
        read_file.close();
    }
}

//CL source code
__kernel void add_array2D(__global int *in_array,__global int *out_array,
                          int width,int height)
{
    const int i = get_global_id(0);
    const int j = get_global_id(1);
    //in_array[i+j*width] = out_array[i+j*width]+1;
    out_array[i+j*width] = in_array[i+j*width] + 1;
}

0 Likes

There is no hang with internal implementations. My guess at the reason is that although many kernels are enqueued, each one waits for the previous one, so the display can be refreshed after each kernel. There should be a hang if you run a single very large kernel.

0 Likes

Thanks for your reply. I think I will stay with version 2.2 of the SDK, because it works very well on my computer with all OpenCL programs and never hangs. I hope the next AMD APP SDK will work well with my system again.

When you write about internal implementations, do you mean the same SDK v2.3 that I downloaded from the AMD web site and tested on my system, or do you have a special developer build provided by AMD?

 

0 Likes

I am referring to internal test packages.

By "hang" I mean a display hang, which happens because the GPU cannot refresh the GUI while it is stuck executing a long kernel. The GPU can even restart itself after a watchdog timer expires. See the Developer Notes for details. This hang is expected.

Can you please do the following tests? Try running it for fewer iterations (1000).

Also try running only enqueueReadBuffer or only enqueueNDRangeKernel, one at a time. Do you still see the crash?

 

0 Likes

My small test program hangs the display after about 1500 iterations with a 256x256 array and version 2.3 of the SDK. So with fewer iterations it works fine.

I will run my program and the FluidSimulation2D sample under GDB so that the enqueueReadBuffer calls run one at a time. Then I will run both programs in release mode and wait a couple of minutes to see whether the system is able to recover after the GPU stops responding.

Thank you very much for your support.

 

 

 

0 Likes

I have tested the FluidSimulation2D sample under the GDB debugger and the GUI hang still occurs. Then, in my small test program, I added a sleep(1) call after each enqueue call, and the GUI hang still occurs. Unfortunately, under Linux, when a deadlock occurs on the GPU, the only way to recover is to reboot.

For now, my hypothesis is that there may be a problem with DMA transfers between the memory of the host (CPU) and the compute device (GPU), because SDK v2.3 enabled drmdma for this release. This would also explain why the SimpleGL sample works fine: its memory stays on the compute device.

Next, I will try to modify my simple program so that, if possible, it uses only GPU memory between kernel invocations, to see what happens.

 

 

0 Likes

After modifying my simple program, I found that the problem is the frequent memory transfers between the CPU and the GPU. The modified program now runs fine with version 2.3 of the SDK, at 75% GPU usage.

So with version 2.3 of the SDK I must avoid frequent memory transfers between the CPU and the GPU, such as the ones the FluidSimulation2D sample program seems to perform.
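To put a rough number on "frequent memory transfers", the sketch below estimates how many bytes the original test loop asks the runtime to copy. It assumes one full-array write and one full-array read per iteration and a 4-byte cl_int; the function name bytesTransferred is hypothetical, and the real bus traffic may differ because the buffers were created with CL_MEM_USE_HOST_PTR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rough estimate (hypothetical helper): bytes requested for copying by a loop
// that writes the whole array to the device and reads it back once per iteration.
std::uint64_t bytesTransferred(std::size_t width,
                               std::size_t height,
                               std::size_t elemSize,
                               std::uint64_t iterations)
{
    assert(elemSize > 0);
    const std::uint64_t perCopy =
        static_cast<std::uint64_t>(width) * height * elemSize;
    // One host-to-device copy plus one device-to-host copy per iteration.
    return 2 * perCopy * iterations;
}
```

For a 256x256 array of 4-byte integers this gives 524288 bytes per iteration, or 786432000 bytes over the roughly 1500 iterations after which the hang was reported. The modified program performs a single write and a single read in total, which corresponds to one iteration's worth in this estimate.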

 

//Modified C++ code
// Upload the input once, before the loop.
err = queue.enqueueWriteBuffer(input, CL_TRUE, 0, WIDTH*HEIGHT*sizeof(cl_int), in_data);

for(i = 0 ; i < NUM_ITERATIONS ; i++)
{
    /*
    err = queue.enqueueWriteBuffer(input, CL_TRUE, 0, WIDTH*HEIGHT*sizeof(cl_int), in_data);
    */
    // sleep(1);
    err = queue.enqueueNDRangeKernel(kernel, cl::NullRange, cl::NDRange(WIDTH,HEIGHT),
                                     cl::NullRange, NULL, &event);
    if(err != CL_SUCCESS)
    {
        cout<<"NDRange kernel error"<<endl;
    }
    event.wait();
    // sleep(1);
    /*
    queue.enqueueReadBuffer(output,CL_TRUE,0,WIDTH*HEIGHT*sizeof(cl_int),out_data);
    // sleep(1);
    memcpy(in_data,out_data,WIDTH*HEIGHT*sizeof(cl_int));
    cout<<i<<endl;
    */
}

// Read the result back once, after the loop.
queue.enqueueReadBuffer(output,CL_TRUE,0,WIDTH*HEIGHT*sizeof(cl_int),out_data);

for(i = 0 ; i < WIDTH*HEIGHT ; i++)
{
    cout<<out_data[i]<<endl;
}

//Modified CL kernel
__kernel void add_array2D(__global int *in_array,__global int *out_array,
                          int width,int height)
{
    const int i = get_global_id(0);
    const int j = get_global_id(1);
    //in_array[i+j*width] = out_array[i+j*width]+1;
    out_array[i+j*width] = in_array[i+j*width] + 1;
    // Copy the result back into the input on the device, so no host transfer is needed.
    in_array[i+j*width] = out_array[i+j*width];
}

0 Likes

Thanks, MicahVillmow, for the reply. I will use version 2.2 of the SDK until the next release.

0 Likes

Just a follow-up. I finally found out why my system hangs. After I upgraded the BIOS of my motherboard, all my settings were reset to their defaults. I then tried the OpenCL samples that previously hung my system with version 2.3 of the SDK. They now work fine with this version.

The culprit was the ganged/unganged memory setting in the BIOS. Setting the memory to ganged mode makes every OpenCL sample that uses a lot of bandwidth between the CPU and the GPU (FluidSimulation and PCIeBandWidth) hang my system. So I must leave it set to unganged.

This problem was hard to find because the hang only happened with version 2.3 of the SDK. I thought the problem was in the software, but in fact it was my BIOS setting. I apologize for the false report, but in the process I learned that I must debug my applications in a console window.

Many thanks to everyone who helped me.

 

 

0 Likes

trinitrotoluene,
This is a known issue; the current workaround is to debug in a console window and not in X.
0 Likes