Lower level compute API than OpenCL

Discussion created by matusi143 on Sep 26, 2013
Latest reply on Jan 30, 2014 by tugrul_512bit

After yesterday's announcement by AMD about Mantle and the performance gains of this low-level API, I began wondering about OpenCL.


If you are not familiar with Mantle, AnandTech has a pretty good summary ("Understanding AMD’s Mantle: A Low-Level Graphics API For GCN").  What I got out of it is that performance inherently suffers because of the significant overhead of writing to a generic device in DirectX or OpenGL.  By coding directly to the new AMD Hawaii GPU (5 TFLOPS, btw) with the Mantle API, developers can achieve 9x the performance in draw calls.  My question is whether anyone has an idea of what kind of overhead OpenCL introduces and what, if anything, we can do to get around it.  If I could get a 9x or even 3x performance improvement by coding to a specific device (e.g. a high-end FirePro), I would be more than happy to do that for my most performance-intensive subroutines.