spectral
Adept II

ATI Stream SDK v2.2.01 ?

Hi,

The ATI Team is doing a great job with the SDK because they add a lot of features, improve performance and so on...

I have seen the 'roadmap', and it says SDK 2.3 is planned for the end of the year! That is, of course, good news because there are some very interesting language features, but...

The problem is that we have no "stable" OpenCL platform, and so we are unable to 'release' any commercial software based on OpenCL.

It is a lot of work to have something that works correctly on every kind of CPU and GPU, but maybe that should be the priority before adding new features that will "break" the current version of the SDK and introduce new side effects (bugs !?).

So I would like to hear from the ATI team and other people using OpenCL.

I think it is important for the success of this project to allow "commercial" products to be released. For example, our competitors use CUDA, which works very well.

Thanks

0 Likes
26 Replies
LeeHowes
Staff

I'm not sure I understand. Obviously it is optimal for a software development process to not break previous code when a new version is released. This is not always possible if previous code was based on a workaround for a bug of some sort in the earlier version.

Clearly you wouldn't prefer that upgrades not happen: there are still missing features and performance improvements to make, after all. CUDA is much later in its development process and early CUDA versions were pretty feature-limited in comparison to recent ones. OpenCL is rushing in quite a number of those late features all in one go. I think 2.3 will offer you some very nice improvements and you'd rather have it than not have it.

Are there specific new features that have broken old code that have concerned you and specifically stopped you releasing commercial software?

0 Likes

Originally posted by: LeeHowes

Are there specific new features that have broken old code that have concerned you and specifically stopped you releasing commercial software?

1. SKA tool only shows N/A or crashes when you copy-paste the kernel.

2. The ATI Profiler cannot profile DLLs (and I cannot even install it, because it breaks VS and takes forever to install). Keep in mind that most GPGPU programs are DLL plugins for 3dsmax, Photoshop, Nuke or GIMP. On top of that, x64 apps are not supported.

(Points #1 and #2 together result in poorly optimized apps.)

3. The JIT compiler's window pops up each time you call clBuildProgram().

4. No DMA support and no concurrent kernels. This can hurt speed a lot, especially if you execute lots of kernels or move a lot of data over the PCI bus.

5. No image support for CPU.

6. Radeon 4XXX cards have serious limitations in OpenCL. I know, it's a HW thing, but it's a problem because a lot of people still use these Radeons.

7. Software memory limitations (yes, the infamous GPU_MAX_HEAP_SIZE).

8. Catalyst 10.10 APP broke a lot of apps (I think you have a bug managing multiple events). I also really cannot understand why you let users choose between a non-OpenCL driver and an OpenCL one. That will confuse users, and your app won't run if they installed the non-APP one. OpenCL should not be optional, just as OpenGL and DX are not.

9. Lack of OpenCL compiler options (-cl-fast-relaxed-math, #pragma unroll, etc.). I'm not sure how much this affects performance, but I would like to have them available, please.

10. Lack of a true GPU debugger (like NVIDIA Nsight). If you could build something like Nsight, that would be fantastic.

0 Likes

Most of those are bugs, though, that you would expect to see fixed in newer versions (some of which are actually fixed). Not issues that have previously worked and since been broken by newer versions.

As for progress, think how long it took nvidia before they produced Nsight. For a long time a basic profiler was all they had available. Tools have evolved with time. Radeon 4xxx has severe limitations because it is not hardware designed to run OpenCL.

Your Catalyst 10.10 comment is the only one that directly seems to relate to the OP. I don't know what specific set of things that broke, though. I just wanted a better idea of what changes had actually made OpenCL applications work *less well* with time. We all know there are bugs and limitations of the SDK. This should be expected while people like Micah are working very hard on improving it with time.

0 Likes

could we expect to see a better profiler for opencl, something like nvidia has?

0 Likes

Hi,

Sure, the OpenCL Team is doing a great job, really !! At each release the SDK is better, faster, has new features, ... no doubt you work very hard and that you have great results.

I just want to say that it is more important for us that you fix the bugs before adding new features, in order to have a stable software.

Of course, I'm also excited about the new features you add, but they are not my priority. Bug fixes are more important.

Thanks

0 Likes

As for me, multi-GPU issues and DMA are the most important, because I use an ATI GPU-based cluster for scientific research. Right now I pay a big penalty for using more than one ATI GPU on one motherboard.

0 Likes

1. SKA tool only shows N/A or crashes when you copy-paste the kernel.


Please send the kernel that causes the crash to gputools.support@amd.com. We are not aware of a valid kernel that can cause the tool to crash. We are aware that some kernels may generate N/A; we are working to improve this.

2. The ATI Profiler cannot profile DLLs (and I cannot even install it, because it breaks VS and takes forever to install). Keep in mind that most GPGPU programs are DLL plugins for 3dsmax, Photoshop, Nuke or GIMP. On top of that, x64 apps are not supported.

(Points #1 and #2 together result in poorly optimized apps.)

You should profile the DLL together with the application program (3dsmax, Photoshop, Nuke, GIMP, etc.) or a test program. I am not sure what it would mean to profile a DLL without the application program. To profile the DLL with the application, start the application through the profiler's command-line mode.

We will improve the install time in the next version.  Also, profiling x64 apps will be supported in the next version which will be released in the next several days (check out the profiler's webpage soon).

could we expect to see a better profiler for opencl, something like nvidia has?


I hope so. If there are features that you'd like to see in the profiler, please send them to the email above (or post them in the GPU Developer Tools forum).

0 Likes

viewon01,
We try to fix all known bugs in a release before shipping it. The issues we aren't able to fix, we try to document in the release notes. If we are breaking something important to your application, please feel free to send us a test case or let us know about it so we can fix it for the next release.
0 Likes

I can't agree with you; a lot of bugs are not mentioned in the release notes. For example, the multi-GPU issue or the problem with DMA. Let's see:

////-------------- from the 10.9 driver release notes

Known Issues
The following section provides a brief description of known issues associated with the latest version of ATI Catalyst™ Linux software suite. These issues include:
- Killing X-server after resuming from hibernation may cause the system to stop responding
- System may fail to display error message when improper position values are used in "aticonfig --tv-geometry" resulting in invalid "TVHPosAdj" and "TVVPosAdj" values in xorg.conf file
- Rotated screen may fail to properly restore after running full-screen OpenGL application in dual-head mode; a switch VT will restore screen correctly
- Mouse cursor might be blocked from entering the taskbar area after applying specific rotations
- Significant delay may be observed while rotating screen with XRandR
- Segmentation fault may occur when running "Quake 4" and "Enemy Territory: Quake Wars" at 1280x1024 or resolutions higher than current desktop resolution

///----

Not a single OpenCL issue is mentioned...

And ATI_Stream_SDK_Release_Notes_Developer.pdf does not mention the multi-GPU penalty, DMA, -cl-fast-relaxed-math, -cl-mad-enable, and so on.

0 Likes

zeland,
Please refer to the SDK release notes, not the catalyst release notes.
0 Likes

I've updated my previous post.

0 Likes

From the dev release notes on the multi-gpu issue.
"The ATI Radeon™ HD 5970 GPU is currently supported in single-GPU mode only. It is recommended users only access the first device on an ATI Radeon™ HD 5970 GPU for GPU compute."
For DMA and the compiler options, I believe we have not stated that we support those features yet, I could be wrong. They are features that are not implemented, not bugs in our implementation. We are working on getting them implemented in a future release, so they will be coming.
0 Likes

I am talking not only about the 5970 but about using more than one ATI GPU on one motherboard. For example, I run two instances of my OpenCL program on two different ATI GPUs (5870s under Linux), and I see a slowdown ranging from 1.5-1.7x (with 10.10) to more than 2x (on some older drivers). Is that a bug?

And in my opinion the missing -cl-fast-relaxed-math is a bug, because it works on NVIDIA's implementation of OpenCL, and works pretty well (with -cl-fast-relaxed-math my program is about 2 times faster). And it is included in the OpenCL specification.

But what we call the problem is not so important. I hope that these and some other problems will be listed under "Known issues".

0 Likes

Originally posted by: MicahVillmow 

 For DMA and the compiler options, I believe we have not stated that we support those features yet, I could be wrong.



http://developer.amd.com/gpu/ATIStreamSDK/Pages/default.aspx

What’s New in v2.2
Support for OpenCL™ 1.1 specification.
Please see the OpenCL™ 1.1 specification for more information about this feature.

0 Likes

We support the specification and the compiler accepts the options as required by the specification. When the option does anything in our implementation, it will be announced in the release notes.
0 Likes

Please just list it under "Known issues" until it is fixed. It is not fair to make beginners search through all the forums just to find out what really works.

0 Likes

Zeland,
I've requested that it get added to the release notes for 'Known Issues' for SDK 2.3.
0 Likes

So I can't expect it to be fixed before 2.4?

0 Likes

I can't give estimates on when it will get implemented, but this feature was not implemented for our upcoming 2.3 release.
0 Likes

What about the SDK 2.2 release notes? And what about the other "problems"? Will they be mentioned?

0 Likes

I think it would be useful if all known issues were collected in one place.

0 Likes

Yes, a bug tracking tool that external developers (us) have access to 😛

0 Likes

I have an idea for the next ATI Stream SDK. What about having the clc compiler optimize the rotate() function to use bitalign? This would be very useful. Right now I am writing a SHA-1 hashing kernel which depends heavily on cyclic rotate operations. As there is no compile-time way to check if AMD media ops are supported (e.g. something like #ifdef BITALIGN_SUPPORTED), I have to write two separate kernels and use one of them depending on the device info returned by clGetDeviceInfo.

Not sure if that wasn't discussed before though.
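For reference, the host-side check described above can be sketched roughly like this. It assumes the extension list is the space-separated string that clGetDeviceInfo returns for CL_DEVICE_EXTENSIONS; the helper name has_extension is hypothetical and not part of any SDK, and the OpenCL call itself is left out so the snippet stays plain C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of any SDK): returns 1 if `name` occurs
 * as a whole space-separated token in `ext_list`, otherwise 0.
 * `ext_list` is assumed to be the string obtained from
 * clGetDeviceInfo(..., CL_DEVICE_EXTENSIONS, ...). */
int has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return 0;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is delimited by spaces or the string
         * ends, so "cl_amd_media_ops2" does not match "cl_amd_media_ops". */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return 1;
        p += 1; /* keep scanning after a partial hit */
    }
    return 0;
}
```

The host would then pick the bitalign kernel when has_extension(extensions, "cl_amd_media_ops") returns 1 and the generic kernel otherwise. Matching whole tokens rather than using a bare strstr avoids false positives on extension names that share a prefix.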

0 Likes

You can use

#ifdef cl_amd_media_ops
//code using media ops from AMD
#else
//generic code
#endif

Every supported extension is defined as a preprocessor macro with the extension's name.
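To illustrate what the two branches compute for a SHA-1 style rotate, here is a host-side sketch in plain C. The function bitalign_model is only a model of amd_bitalign as I understand its description in the cl_amd_media_ops extension (concatenate the two 32-bit inputs and shift right by the low 5 bits of the third); it is an assumption for illustration, not the real built-in, and all three function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of amd_bitalign(src0, src1, src2): treat src0:src1 as one
 * 64-bit value, shift right by (src2 & 31), keep the low 32 bits. */
uint32_t bitalign_model(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint64_t wide = ((uint64_t)src0 << 32) | (uint64_t)src1;
    return (uint32_t)(wide >> (src2 & 31u));
}

/* Generic path: portable rotate-left, equivalent in effect to what
 * rotate(x, n) does for 32-bit values in a kernel. */
uint32_t rotl32_generic(uint32_t x, uint32_t n)
{
    n &= 31u;
    /* The second shift count is masked so that n == 0 never shifts by 32. */
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Media-ops path: rotate-left by n expressed as a bitalign of x with
 * itself, shifted right by 32 - n. */
uint32_t rotl32_bitalign(uint32_t x, uint32_t n)
{
    assert(n < 32u);
    return bitalign_model(x, x, 32u - n);
}
```

In an actual kernel, the #ifdef cl_amd_media_ops branch would call the real amd_bitalign(x, x, 32 - n) and the #else branch would call rotate(x, n); both should give the same result for rotate amounts 0 to 31, which is what the model above lets you check on the host.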

0 Likes

Thanks for this, I didn't know about it, that will help a lot (at least no more bitalign/4xxx kernels).

Yet, I think the best would be to optimize the rotate() function in clc. This would also eliminate the preprocessor directives which make the code look ugly and bloated.

 

0 Likes

gat3way,
There have been improvements in codegen to optimize for bitalign in the upcoming release. If your particular case is missed, please let me know with a test case and I'll work on getting the optimizer to recognize it.
0 Likes