Since the beginning of OpenCL, AMD has tried to make OpenCL development easier by providing tools and SDKs. Thank you for that.
As I have seen across my different OpenCL projects, we always need the same primitives: parallel scans, sorting, FFTs, and so on. Lots of people spend time writing their own implementations or trying to find the best available library for the job (clMath and clFFT are good ones, by the way).
This has disadvantages both for end developers like me and for OpenCL implementers like AMD:
- end developers need to ship these libraries, keep them updated, and test them on each platform to be sure they compile and run correctly: that's yet another dependency;
- for OpenCL implementers, there is no guarantee that the end developer will write optimal code, or even code that builds. Maybe the end developer will do something stupid. Moreover, exposing parallel primitives through OpenCL function pointers would let you write optimal code for your own products (for example, parallel scans using SIMD shared registers or similar, which I am not sure is currently possible from OpenCL).
From an end developer's point of view, we would like to have access to parallel primitives directly from OpenCL, just as D3DCSX provides them for DirectCompute.
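To make the request concrete, here is a rough sketch of the kind of primitive I mean: an inclusive prefix sum written in the step-doubling (Hillis-Steele) style that people typically hand-roll inside a work-group. It is plain C rather than OpenCL C so it compiles anywhere; the names `inclusive_scan_add` and `SCAN_MAX_ITEMS` are only illustrative and are not part of any existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit standing in for a work-group size. */
#define SCAN_MAX_ITEMS 256

/*
 * Inclusive prefix sum, Hillis-Steele style: about log2(n) passes,
 * each pass adds the element `offset` slots to the left.
 * On a GPU, each iteration of the inner loop would be one work-item,
 * and the copy back into `data` would be replaced by a barrier.
 */
void inclusive_scan_add(int *data, size_t n)
{
    int tmp[SCAN_MAX_ITEMS];
    assert(n <= SCAN_MAX_ITEMS);

    for (size_t offset = 1; offset < n; offset *= 2) {
        for (size_t i = 0; i < n; ++i) {
            /* Elements closer than `offset` to the start are already final. */
            tmp[i] = (i >= offset) ? data[i] + data[i - offset] : data[i];
        }
        memcpy(data, tmp, n * sizeof data[0]);
    }
}
```

Calling `inclusive_scan_add` on the array {3, 1, 4, 1, 5} leaves {3, 4, 8, 9, 14} in place. Every project ends up carrying some variant of this, tuned (or not) per device, which is exactly the work a vendor-provided primitive would remove.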
Please consider my request for an extension,