I would like to post what are, for me, the most important bugs to fix, so
that AMD can at least publish them as known issues in the release notes.
For me, right now:
*Fix efficient support for dual-GPU single cards; right now full utilization of every core does not seem possible
*Implement asynchronous DMA copies -> dual DMA for 69xx
Aside from the features asked for in the topic "APP 2.3 supports it?",
I want to add:
*Access host memory from device kernels
*Expose the AMD IL native language in more ways:
1) Expose the AMD IL file as a binary file format so we can play with the generated AMD IL code before building it. Nvidia supports PTX as a binary format, so this kind of tweaking is possible there.
2) Also allow asm("") constructs in kernel code (for both GPU and CPU backends) so we can add optimized assembly code directly to a kernel. The idea here is not cross-vendor code but tuned code. CUDA supports this using asm with PTX code; you don't have that feature or a vendor solution, so one option would be an AMD OpenCL extension adding that support.
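To make the asm("") request concrete, here is a minimal sketch of what the construct looks like today in plain host C with GCC/Clang extended asm on x86. It is not OpenCL kernel code and not AMD IL; add_u32 and add_u32_selfcheck are made-up names, and on other compilers or CPUs the sketch falls back to ordinary C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: adds two 32-bit integers.
   On x86/x86-64 with GCC or Clang it uses one hand-written instruction;
   everywhere else it falls back to plain C. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    /* AT&T syntax: add operand %1 (b) into operand %0 (a).
       "+r"(a): a is both read and written, kept in a register.
       "r"(b):  b is a read-only register input.
       "cc":    the instruction modifies the CPU flags. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b;
#endif
}

/* Small self-check, including unsigned wrap-around. */
void add_u32_selfcheck(void)
{
    assert(add_u32(2u, 3u) == 5u);
    assert(add_u32(0xFFFFFFFFu, 1u) == 0u);
}
```

What I am asking for is the same kind of escape hatch inside a kernel, with AMD IL (or x86 for the CPU backend) in place of the instruction above.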
Thanks for all the info, especially on how host memory will be exposed.
I was afraid of asking too much, but seeing the response I will add another point:
*Lift the restriction that the CAL (and OpenCL) driver under Linux requires an X session or DRI dependencies, so that only the kernel module needs to be loaded. That would allow use similar to the Nvidia CUDA/OCL Linux driver with a simple script.
This would allow companies like Amazon to sell cloud services using AMD GPUs, similar to what they are doing right now with Nvidia GPUs. Well, not quite, as it seems they are more or less using SLI Multi-OS in the Tesla world, assigning each device directly to the virtual machine (via Intel VT-d or AMD IOMMU tech). So this brings my next question: assuming CAL works with only the fglrx kernel module loaded, how advanced is your driver? Will it be able to use the AMD IOMMU to assign GPU devices to a virtual machine?
(Note I'm not even asking for the similar thing in the Windows world, the so-called Nvidia TCC driver, which exposes graphics cards as compute devices.)
Of course, in those cases you should disable the GL interop extension (cl_khr_gl_sharing) so apps can behave accordingly.
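On the application side, "working accordingly" could look like the sketch below: check the device's extension string before touching any GL interop path. has_extension is a made-up helper; in a real program the list would come from clGetDeviceInfo with CL_DEVICE_EXTENSIONS, and the OpenCL spec names the GL interop extension cl_khr_gl_sharing. The sketch only does the string matching, so it builds without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears as a whole
   space-separated token in `ext_list`, 0 otherwise.
   A plain strstr() is not enough, because one extension name can be
   a prefix of another. */
int has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL && name[0] != '\0');

    size_t len = strlen(name);
    const char *p = ext_list;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        /* ...and end at the end of the string or right before a space. */
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return 1;
        p += 1; /* keep searching after this partial match */
    }
    return 0;
}
```

An app would then skip its GL sharing setup whenever has_extension(list, "cl_khr_gl_sharing") returns 0, which is what a headless driver would report.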
*I'm afraid of asking too much, but about GPU device debugging: can you at least say whether your GPUs (58xx or 69xx) support it in hardware and it is only a matter of implementing a cal-gdb port, or whether your GPUs really lack hardware support for it (trap support, for example)?
I still hope for two things:
*It would be good if, when a binary file targets a different architecture, the compiler at least used the AMD IL code included in the binary. I think CUDA also allows something similar.
*As you don't say anything about including AMD IL asm("") code in kernels, and say the x86 backend supports it, I still hope this will be evaluated at some point for inclusion.
Many thanks, Micah, for the clear and concise info.
Just another bug, present (introduced) since Catalyst 10.2/10.3: on Windows 7 with both ATI and Nvidia drivers installed, almost any call to CAL library functions (OpenCL included, of course) crashes, as this loads some aticfx32.dll (multi-GPU library?). It seems this multi-GPU library assumes all GPUs must be AMD ones. I know very few users are experiencing this problem, but those who do are mainly us *developers*. It makes a lot of sense to have both an Nvidia and an AMD GPU in the same system to test code with both GPUs.
I hope AMD has at least reproduced this bug internally. I think it has been reported before.
This seems to solve the problem (both 32 and 64 bits); at least the CLinfo program doesn't crash.
Since this library seems to be a graphics one, can this have any implications for OpenGL or D3D interop? Anyway, I will try to answer that myself.
I hope this hack isn't necessary in the next versions.
Also, some environment variables I found in 2.2:
set GPU_OPEN_VIDEO=1 exposes the cl_amd_open_video extensions, but no info is available.
I hope cl_amd_atomic_counters32 also gets exposed in 2.3 with some info.
Also, will GPU_ZERO_COPY_ENABLE=1 enable the use of host memory you mentioned previously with USE_HOST_PTR?
Finally, I have found GPU_USE_NEWLIB and GPU_MEMORY_COHERENCY, though I don't know whether they are really new in 2.2.
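For anyone who wants to experiment with these switches from inside the host program instead of the shell, here is a minimal sketch. It assumes a POSIX system (setenv; on Windows the shell `set` above is the equivalent) and it assumes the runtime reads the variables when it initializes, which I have not confirmed. enable_flag and enable_experimental_flags are made-up names, and what each variable really does is still undocumented.

```c
#define _POSIX_C_SOURCE 200112L /* for setenv() */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: sets NAME=1 in this process's environment.
   Returns 0 on success, -1 on failure (same convention as setenv). */
int enable_flag(const char *name)
{
    assert(name != NULL && name[0] != '\0');
    return setenv(name, "1", 1 /* overwrite any existing value */);
}

/* Call this before the first OpenCL/CAL call (assumption: the runtime
   reads its environment at initialization). Returns 0 if every
   variable was set, -1 otherwise. */
int enable_experimental_flags(void)
{
    static const char *const flags[] = {
        "GPU_OPEN_VIDEO",
        "GPU_ZERO_COPY_ENABLE",
    };

    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; ++i) {
        if (enable_flag(flags[i]) != 0)
            return -1;
    }
    return 0;
}
```

I left GPU_USE_NEWLIB and GPU_MEMORY_COHERENCY out of the list on purpose, since I don't know what values they expect.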