Recently I have noticed that NVIDIA has explicit support for deep learning frameworks. Probably all the major frameworks (Caffe, Theano, Torch, etc.) support CUDA, and in some cases more specific libraries have been built, such as cuDNN. So what is the counterpart of these in the AMD/ATI ecosystem? I assume OpenCL is the way to go, but references for it are so scarce that I have to ask. I also think that if we had an OpenCL counterpart of what has already been built in CUDA, we would have more hardware to exploit: not only NVIDIA hardware but also multicore systems and Intel, AMD, or Parallella hardware.
Thanks for your time.
Yes, I don't think AMD is as strong here, and NVIDIA has been adding more functionality to its hardware to make DNN operations faster. Sadly, I don't think we'll ever see such optimizations on AMD hardware. NVIDIA dominates the field among researchers, so OpenCL sees very little use for those applications, or at least there are no serious public libraries or projects built on it. The gap looks even wider when you compare what you can do in the CUDA language with what you get in OpenCL 1.x (which is what most implementations support, although I'd bet it will be 2.0 for all GPUs within a year). I really wish there were more extensions so that some of the CUDA features could be brought into OpenCL - preferably on both AMD and NVIDIA. It's a double-edged sword, but for the right algorithms it's worth it.
You are right that OpenCL gives you more hardware to run on. There is one hardware architecture you missed that is a critical target for some: FPGAs. Keep in mind that your kernel sources will sometimes have to change for them, but the language is OpenCL 1.0, so there is often good reuse.
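To make the "more hardware" point a bit more concrete, here is a minimal sketch of how you could list every OpenCL device a machine exposes, whatever the vendor. It assumes the third-party pyopencl binding is installed; the helper name list_opencl_devices is just made up for illustration, and it returns an empty list if the binding or an OpenCL driver is missing.

```python
def list_opencl_devices():
    """Return (platform, device, type) tuples for every OpenCL device found.

    Assumes the third-party pyopencl package; returns [] when it is missing
    or when no OpenCL driver is installed on the machine.
    """
    try:
        import pyopencl as cl  # third-party binding, assumed to be installed
    except ImportError:
        return []

    found = []
    try:
        # Each vendor runtime (AMD, Intel, NVIDIA, ...) shows up as a platform.
        for platform in cl.get_platforms():
            # A platform can expose CPUs, GPUs, and accelerators.
            for device in platform.get_devices():
                kind = cl.device_type.to_string(device.type)
                found.append((platform.name, device.name, kind))
    except cl.Error:
        # Raised, for example, when no OpenCL platform is installed at all.
        pass
    return found


if __name__ == "__main__":
    for platform_name, device_name, kind in list_opencl_devices():
        print(f"{platform_name}: {device_name} ({kind})")
```

What you see depends entirely on which vendor runtimes are installed, so treat the output as a quick inventory rather than a statement about what a given framework will actually use.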
I absolutely agree on the benefits of OpenCL acceleration in these frameworks. I can't talk about where we're headed, but I can say this is a bleeding edge area that we are really interested in. In the community...
There is one I found for OpenCL in Torch here:
And for Caffe here:
It looks like the Caffe work is just a request to check in basic OpenCL support, but work is underway in the community.
Hope this helps.