Hi,
The ATI team is doing a great job with the SDK: they add a lot of features, improve performance, and so on.
I have seen the roadmap, and they say SDK 2.3 is planned for the end of the year! That is, of course, good news because there are some very interesting language features, but...
The problem is that we have no "stable" OpenCL platform, and so we are unable to release any commercial software based on OpenCL.
It is a lot of work to get something that works correctly on every kind of CPU and GPU, but maybe that should be the priority before adding new features that will "break" the current version of the SDK and introduce new side effects (bugs!?).
So, I would like to hear from the ATI team and from other people using OpenCL.
I think it is important for the success of this project to allow "commercial" products to be released. For example, our competitors use CUDA, which works very well.
Thanks
I'm not sure I understand. Obviously it is optimal for a software development process to not break previous code when a new version is released. This is not always possible if previous code was based on a workaround for a bug of some sort in the earlier version.
Clearly you wouldn't prefer that upgrades not happen: there are still missing features and performance improvements to make, after all. CUDA is much later in its development process, and early CUDA versions were pretty feature-limited in comparison to recent ones. OpenCL is rushing in quite a number of those late features in one go. I think 2.3 will offer you some very nice improvements and you'd rather have it than not have it.
Are there specific new features that have broken old code that have concerned you and specifically stopped you releasing commercial software?
Originally posted by: LeeHowes
Are there specific new features that have broken old code that have concerned you and specifically stopped you releasing commercial software?
1. SKA tool only shows N/A or crashes when you copy-paste the kernel.
2. The ATI Profiler cannot profile DLLs (and I cannot even install it because it breaks VS and takes forever to install). Keep in mind that most GPGPU programs are DLL plugins for 3dsmax, Photoshop, Nuke or GIMP... On top of that, x64 apps are not supported.
(Points #1 and #2 together result in poor app optimization.)
3. The JIT compiler's window pops up each time you call clBuildProgram().
4. No DMA support and no concurrent kernels. This may affect speed a lot, especially if you execute lots of kernels or move a lot of data over PCI.
5. No image support for CPU.
6. Radeon 4XXX cards have serious limitations in OpenCL. I know it's a HW thing, but it's a problem because a lot of people still use these Radeons.
7. Software memory limitations (yes, the infamous GPU_MAX_HEAP_SIZE).
8. Catalyst 10.10 APP broke a lot of apps (I think you have a bug in managing multiple events). On the other hand, I really cannot understand why you let users choose between a non-OpenCL driver and an OpenCL one. That will confuse users, and your app won't run if they installed the non-APP one. OpenCL should not be optional, just as OpenGL and DX are not.
9. Lack of OpenCL compiler options (-cl-fast-relaxed-math, #pragma unroll, etc.). Well, I'm not sure whether this will affect performance much, but I would like to have them available, please.
10. Lack of a true GPU debugger (like NVIDIA Nsight). If you could do something like Nsight, that would be fantastic.
Most of those are bugs, though, that you would expect to see fixed in newer versions (and some of them actually have been fixed). They are not issues that previously worked and have since been broken by newer versions.
As for progress, think how long it took NVIDIA before they produced Nsight. For a long time a basic profiler was all they had available. Tools have evolved with time. Radeon 4xxx has severe limitations because the hardware was not designed to run OpenCL.
Your Catalyst 10.10 comment is the only one that seems to relate directly to the OP. I don't know what specific set of things broke, though. I just wanted a better idea of what changes had actually made OpenCL applications work *less well* with time. We all know there are bugs and limitations in the SDK. This should be expected while people like Micah are working very hard on improving it with time.
could we expect to see a better profiler for opencl, something like nvidia has?
Hi,
Sure, the OpenCL team is doing a great job, really!! With each release the SDK is better, faster, and has new features. No doubt you work very hard, and you get great results.
I just want to say that it is more important for us that you fix the bugs before adding new features, in order to have stable software.
Of course, I'm also excited by the new features you add, but they are not my priority. Bug fixes are more important.
Thanks
As for me, multi-GPU issues and DMA are the most important, because I use an ATI GPU-based cluster for scientific research. Right now I pay a big penalty for using more than one ATI GPU on one motherboard.
1. SKA tool only shows N/A or crashes when you copy-paste the kernel.
Please send the kernel that causes the crash to gputools.support@amd.com. We are not aware of a valid kernel that can cause the tool to crash. We are aware that some kernels may generate N/A; we are working to improve this.
2. The ATI Profiler cannot profile DLLs (and I cannot even install it because it breaks VS and takes forever to install). Keep in mind that most GPGPU programs are DLL plugins for 3dsmax, Photoshop, Nuke or GIMP... On top of that, x64 apps are not supported.
(Points #1 and #2 together result in poor app optimization.)
You should profile the DLL together with the application program (3dsmax, Photoshop, Nuke, GIMP, etc.) or a test program. I am not sure what it would mean to profile a DLL without the application program. To profile the DLL with the application, start the application using the command-line mode of the profiler.
We will improve the install time in the next version. Also, profiling x64 apps will be supported in the next version which will be released in the next several days (check out the profiler's webpage soon).
could we expect to see a better profiler for opencl, something like nvidia has?
I hope so. If there are features that you'd like to see in the profiler, please send them to the email above (or post them in the GPU Developer Tools forum).
I can't agree with you; a lot of bugs are not mentioned in the release notes, for example the multi-GPU issue or the problem with DMA. Let's see:
////-------------- from the 10.9 driver release notes
Known Issues
The following section provides a brief description of known issues associated with the latest version of ATI Catalyst™ Linux software suite. These issues include:
- Killing X-server after resuming from hibernation may cause the system to stop responding
- System may fail to display error message when improper position values are used in "aticonfig --tv-geometry" resulting in invalid "TVHPosAdj" and "TVVPosAdj" values in xorg.conf file
- Rotated screen may fail to properly restore after running full-screen OpenGL application in dual-head mode; a switch VT will restore screen correctly
- Mouse cursor might be blocked from entering the taskbar area after applying specific rotations
- Significant delay may be observed while rotating screen with XRandR
- Segmentation fault may occur when running "Quake 4" and "Enemy Territory: Quake Wars" at 1280x1024 or resolutions higher than current desktop resolution
///----
Not a single OpenCL issue is mentioned.
And ATI_Stream_SDK_Release_Notes_Developer.pdf does not mention the multi-GPU penalty, DMA, -cl-fast-relaxed-math, -cl-mad-enable, and so on.
I've updated my previous post.
I am talking not only about the 5970 but about using more than one ATI GPU on one motherboard. For example, I run two instances of my OpenCL program on two different ATI GPUs (5870s under Linux), and I see a slowdown ranging from 1.5-1.7x (with 10.10) to more than 2x (on some older drivers). Is it a bug?
In my opinion, the missing -cl-fast-relaxed-math support is a bug, because it works on NVIDIA's OpenCL implementation, and works pretty well (my program is about twice as fast with it), and it is part of the OpenCL specification.
But what we call the problem is not so important. I hope that these and some other problems will be listed under "Known issues".
Originally posted by: MicahVillmow
For DMA and the compiler options, I believe we have not stated that we support those features yet, I could be wrong.
http://developer.amd.com/gpu/ATIStreamSDK/Pages/default.aspx
What’s New in v2.2
Support for OpenCL™ 1.1 specification.
Please see the OpenCL™ 1.1 specification for more information about this feature.
Please just list it under "Known issues" until it is fixed. It is not fair for beginners to have to search through all the forums just to find out what really works.
So I can't expect it to be fixed until 2.4?
What about the SDK 2.2 release notes? And what about the other "problems"? Will they be mentioned?
I think it would be useful if all known issues were collected in one place.
Yes, a bug tracking tool that external developers (us) have access to 😛
I have an idea for the next ATI Stream SDK. What about having the clc compiler optimize the rotate() function to use bitalign? This would be very useful. Right now I am writing a SHA-1 hashing kernel which depends heavily on cyclic rotate operations. As there is no compile-time way to check whether the AMD media ops are supported (e.g. something like #ifdef BITALIGN_SUPPORTED), I have to write two separate kernels and pick one of them depending on the device info returned by clGetDeviceInfo.
Not sure whether this has been discussed before, though.
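For reference, the host-side selection described above could be sketched roughly as follows. This is only an illustration, not code from the SDK: has_extension is a hypothetical helper name, and it assumes the extensions string is the space-separated list that clGetDeviceInfo returns for CL_DEVICE_EXTENSIONS. It takes the string as a plain argument so that it has no OpenCL dependency.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole
 * space-separated token in `extensions`. In a real program `extensions`
 * would be the string obtained from
 * clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...). */
bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    if (len == 0) {
        return false;
    }

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is delimited on both sides, so that
         * "cl_amd_media_ops2" does not count as "cl_amd_media_ops". */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token) {
            return true;
        }
        p += 1;
    }
    return false;
}
```

The host would then build the bitalign kernel when has_extension(ext, "cl_amd_media_ops") is true and the generic kernel otherwise.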
you can use
#ifdef cl_amd_media_ops
//code using media ops from AMD
#else
//generic code
#endif
Every supported extension is defined as a preprocessor macro with the extension's own name.
Thanks for this, I didn't know about it, that will help a lot (at least no more bitalign/4xxx kernels).
Still, I think the best solution would be to optimize the rotate() function in clc. This would also eliminate the preprocessor directives, which make the code look ugly and bloated.
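To make the rotate()/bitalign relationship concrete, here is a small plain-C model that can be checked on the host. It is a sketch under assumptions: bitalign_model is a hypothetical name that mirrors the behaviour documented for amd_bitalign in the cl_amd_media_ops extension (concatenate two 32-bit words, shift right by src2 & 31, keep the low 32 bits), and rotl32 stands in for OpenCL's rotate() on uint. Neither function is SDK code.

```c
#include <stdint.h>

/* Plain-C model of amd_bitalign (assumed semantics): treat src0 as the high
 * 32 bits and src1 as the low 32 bits of a 64-bit value, shift right by
 * (src2 & 31), and return the low 32 bits. */
uint32_t bitalign_model(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint64_t wide = ((uint64_t)src0 << 32) | (uint64_t)src1;
    return (uint32_t)(wide >> (src2 & 31u));
}

/* Reference left rotate; the shift count is taken modulo 32. */
uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* Left rotate expressed through the bitalign model: rotating x left by n
 * equals shifting the 64-bit value x:x right by (32 - n). For n == 0 the
 * shift amount 32 is masked to 0, which returns x unchanged. */
uint32_t rotl32_via_bitalign(uint32_t x, uint32_t n)
{
    return bitalign_model(x, x, 32u - (n & 31u));
}
```

Inside a kernel, the same identity would let the #ifdef cl_amd_media_ops branch use amd_bitalign(x, x, 32 - n) while the #else branch keeps rotate(x, n), so the rest of the SHA-1 code can stay shared between the two paths.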