It seems the 69xx compute features have been leaked as slides.
Before talking about that, though, I'd like to know whether GDS will be exposed
in OpenCL, since it is even mentioned in the OCL webinars, and whether AMD is still interested in implementing:
*global wave sync?
The new features are:
*concurrent kernel support ("asynch kernel dispatch" in the slide): asynchronous kernel launch is already supported on the 58xx series, right? Concurrent kernels apparently are too, but it seems the 69xx adds a private address space for every kernel, so this should be easier to expose in OpenCL. Can we expect concurrent kernel support to be exposed in the launch OCL drivers? A sample using concurrent kernels would be good.
Also, will every SIMD core be limited to running threadblocks from a single kernel (like Fermi), or will it be able to run arbitrary kernels (within the limits of local mem usage and register pressure)?
*dual DMA engines: good job, but will they be supported at the same time as single DMA, or later? Can we expect support shortly after launch (1-2 months after)? Also, this can be answered later, but the dual-chip Antilles 69xx will have quad DMA engines, right? Since each GPU has its own memory address space, that will be necessary, right?
An OCL dual DMA sample would be good too.
*The slides seem to mention full APP support for Antilles. I hope multi-GPU cards can be exploited from OpenCL without serialization points, so that 100% usage of all GPUs can be reached using two independent command queues. Some simple scalable samples would add trust in the multi-GPU hardware and driver.
In the ROPs slide I see "coalescing writes" support, and in the compute slide "coalescing of shader read". So I ask: aren't coalesced reads and writes already supported on the 58xx? I hope that after the 69xx release someone can explain what specific improvements the 69xx series adds, or what coalescing limitations the 58xx hardware has.
I also see a "fetch direct to lds". I hope this allows going from host mem to LDS mem without passing through global mem. It's not clear what an OpenCL extension for it would look like: either a host-side API that allows sending data to LDS, or a kernel function similar to the prefetch-to-local-mem functions. The latter would imply exposing device access to host memory in OpenCL, which would be the right thing to do. Right now even the 5xxx series allows accessing host mem from the device (called mem import and export), but that is not exposed in OpenCL.