spectral
Adept II

Computer is dead/locked with the latest SDK!

Hi,

I tried to run the clpp GPU radix sort on an HD 6950, and after one sort the computer locks up!

To test, you can simply download the latest version from SVN (not the packaged one) and run it... and you will have to reboot your computer!

It is a CRITICAL problem... so please do your best to fix it!

Thanks

0 Likes
18 Replies
spectral
Adept II

Can someone from the AMD team test it?

My computer locks up each time I start the sort algorithm! (I know that I'm not the only one to hit this kind of problem!)

You can download the latest version from SVN here:

http://code.google.com/p/clpp/

0 Likes

viewon01,
Does the hang occur if all the barriers are commented out? Also, can you set GPU_BARRIER_DETECTION=true to see if that fixes the hang?
0 Likes

Thanks Micah,

I will try, but where and how can I set GPU_BARRIER_DETECTION=true?

0 Likes

It is an environment variable.
On Windows, use 'set GPU_BARRIER_DETECTION=true' and then run your app from the same command line.
On Linux, use 'export GPU_BARRIER_DETECTION=true' and then run your app from the same shell.
0 Likes

Hi Micah,

I have done some tests:

1) I tested only the 'scan' algorithm, and the computer still locks up.

2) I set the environment variable GPU_BARRIER_DETECTION, but it has NO effect.

3) I removed all the barriers, and with that change the computer DOES NOT lock up. So the problem is related to a local barrier.

What can I do to help you find the problem? Maybe someone at AMD can take a closer look? Tell me what to do.

Thanks

 

BTW: I have also updated to Catalyst 11.6, but it changes nothing 😞

0 Likes

I have also tested with the Catalyst 11.7 preview, the one delivered with gDEBugger.

First, I still have the problem, and there are even some regressions!

I tested my main software with it (this one does not contain any _sync_ operation) and it also locks up the computer!

There is a terrible bug somewhere 😞

Tell me how I can help you debug this.

0 Likes

I have seen that other people's computers also freeze:

http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=150237&enterthread=y

I have also tested my application with Catalyst 11.7, and now even the CPU version crashes... I run it and it exits immediately!

0 Likes

viewon01,

I am looking into this issue. I will let you know if I find something fishy in your code.

Anyway, as you said before, the code doesn't hang once the barriers are removed. I would suggest removing the barriers incrementally, one by one, to figure out which barrier gets stuck.

Also, have you tried gDEBugger? It can be helpful.

0 Likes

Thanks,

So I'll wait for your feedback. Thanks for your help.

0 Likes

Hi Himanshu,

Have you found something? Maybe the problem is somewhere else?

0 Likes

Still no news?

I have tried to find the bug, but maybe I have missed a specific architecture constraint (like the one that Micah is talking about). Or it is a driver bug!

Can you help?

Thanks

0 Likes

Any news?

How can I help to debug, please?

Thanks

0 Likes

Originally posted by: viewon01
Any news? How can I help to debug, please? Thanks

Viewon01,

It looks like they have not followed the OpenCL spec. I am getting the following errors from clBuildProgram. I have tested with SDK 2.5 and the 11.7 driver.

 

==============================================
Platform[AMD Accelerated Parallel Processing] Device[Cypress]

--------------- Satish radix sort Key
Error: Failed to build program executable!
C:\Users\Naganna\AppData\Local\Temp\OCL850A.tmp.cl(100): error: non-kernel
          function: variable with automatic storage duration cannot be stored
          in a named address space
        __local uint localBuffer[TPG*2];
                     ^

C:\Users\Naganna\AppData\Local\Temp\OCL850A.tmp.cl(104): error: identifier
          "localBuffer" is undefined
        uint4 localBits = inclusive_scan_128(localBuffer, tid, block, lane, initialValue, bitsOnCount);
                                             ^

C:\Users\Naganna\AppData\Local\Temp\OCL850A.tmp.cl(128): error: mixed
          vector-scalar operation not allowed unless
          up-convertable(scalar-type=>vector-element-type)
        localBits += localBuffer[block + 4 - 1];
                     ^

C:\Users\Naganna\AppData\Local\Temp\OCL850A.tmp.cl(308): warning: type
          qualifier is meaningless on cast type
      const int4 tid4 = ((const int4)tid) + (const int4)(0,WGZ,WGZ_x2,WGZ_x3);
                          ^

C:\Users\Naganna\AppData\Local\Temp\OCL850A.tmp.cl(309): warning: type
          qualifier is meaningless on cast type
        const int4 gid4 = tid4 + ((const int4)groupId<<2);
                                   ^

3 errors detected in the compilation of "C:\Users\Naganna\AppData\Local\Temp\OCL850A.tmp.cl".
Internal error: compiler frontend invocation failed. Make sure ATISTREAMSDKROOT is set
Program build failure
Assertion failed: clStatus == CL_SUCCESS, file c:\users\naganna\downloads\clpp_v1_beta3\clpp\src\clpp\clppprogram.cpp, line 180
==============================================

 

Please ask the clpp developers to fix all the OpenCL compilation errors for AMD OpenCL.

 

0 Likes

Hi,

The fix has been done. Can you help with a fix for the GPU, please?

Thanks

0 Likes

 

It looks like the latest driver, Catalyst 11.9, fixes the problem.

Can you confirm?

0 Likes

Originally posted by: erwincoumans
It looks like the latest driver, Catalyst 11.9, fixes the problem. Can you confirm?

I still see those errors with the internal libraries. Please ask the clpp developers to fix the issue.

Error log

==============================================
Platform[AMD Accelerated Parallel Processing] Device[Cypress]

--------------- Satish radix sort Key
Error: Failed to build program executable!
C:\Users\Naganna\AppData\Local\Temp\OCLC91D.tmp.cl(100): error: variable with
          automatic storage duration cannot be stored in the named address
          space
        __local uint localBuffer[TPG*2];
                     ^

C:\Users\Naganna\AppData\Local\Temp\OCLC91D.tmp.cl(104): error: identifier
          "localBuffer" is undefined
        uint4 localBits = inclusive_scan_128(localBuffer, tid, block, lane, initialValue, bitsOnCount);
                                             ^

C:\Users\Naganna\AppData\Local\Temp\OCLC91D.tmp.cl(128): error: mixed
          vector-scalar operation not allowed unless
          up-convertable(scalar-type=>vector-element-type)
        localBits += localBuffer[block + 4 - 1];
                     ^

C:\Users\Naganna\AppData\Local\Temp\OCLC91D.tmp.cl(308): warning: type
          qualifier is meaningless on cast type
      const int4 tid4 = ((const int4)tid) + (const int4)(0,WGZ,WGZ_x2,WGZ_x3);
                          ^

C:\Users\Naganna\AppData\Local\Temp\OCLC91D.tmp.cl(309): warning: type
          qualifier is meaningless on cast type
        const int4 gid4 = tid4 + ((const int4)groupId<<2);
                                   ^

3 errors detected in the compilation of "C:\Users\Naganna\AppData\Local\Temp\OCLC91D.tmp.cl".

Internal error: clc compiler invocation failed.

Program build failure
Assertion failed: clStatus == CL_SUCCESS, file c:\users\naganna\desktop\clpp_v1_beta3\clpp\src\clpp\clppprogram.cpp, line 180

 

 



0 Likes

Has anyone got this problem solved yet? What a pity that the very helpful clpp can't be used on AMD GPUs!

0 Likes

To be fair, it looks like this should not compile on any OpenCL hardware, as it breaks the OpenCL spec. I haven't read it yet, but you shouldn't be able to declare __local memory in a non-kernel function, as this would cause all sorts of confusion.
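
For anyone landing here later, here is a minimal sketch of the usual way around the first error in the build logs above: declare the __local array at kernel function scope and pass a pointer to it into the helper. This is not clpp's actual code. The value of TPG and the names neighbor_sum and demo_kernel are made up for illustration, and the #ifndef shim is only there so the helper can also be compiled and checked as plain C on the host.

```c
/* Hypothetical sketch, not clpp's code. */
#ifndef __OPENCL_VERSION__
/* Host build only: define the address-space qualifier away so the helper
   below can be compiled and unit-tested as ordinary C. */
#define __local
#endif

#define TPG 64 /* assumed work-group size, chosen only for illustration */

/* Non-kernel helper. It declares no __local storage of its own; the
   calling kernel owns the buffer and passes a pointer to it. */
unsigned int neighbor_sum(__local const unsigned int *localBuffer,
                          unsigned int lid)
{
    /* Own slot plus the next work-item's slot, wrapping at the group edge. */
    return localBuffer[lid] + localBuffer[(lid + 1) % TPG];
}

#ifdef __OPENCL_VERSION__
/* Device build. Must be enqueued with a local work size of exactly TPG. */
__kernel void demo_kernel(__global const unsigned int *in,
                          __global unsigned int *out)
{
    /* __local arrays are allowed here, at kernel function scope. */
    __local unsigned int localBuffer[TPG];

    const unsigned int lid = get_local_id(0);
    const size_t gid = get_global_id(0);

    localBuffer[lid] = in[gid];

    /* Every work-item of the group reaches this barrier, outside any
       branch, before a neighbour's slot is read. */
    barrier(CLK_LOCAL_MEM_FENCE);

    out[gid] = neighbor_sum(localBuffer, lid);
}
#endif
```

The barrier in the sketch is deliberately unconditional. The OpenCL spec requires that if any work-item in a work-group executes a barrier, all of them do, so a barrier inside a divergent branch is worth ruling out when chasing a hang like the one reported in this thread. Whether that applies to clpp is not established here.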

0 Likes