Hi AMD community,
Background: There is a lot of hype around the Ryzen release and the imminent Vega release. Outside of work, I am getting more and more involved in deep learning approaches to solving different problems. Deep learning requires a lot of computational power. I currently have an A10-7850 Kaveri APU on which I am experimenting with deep learning approaches. Example: It takes about 1000 seconds for 1 epoch of learning in one of my scenarios using InceptionV3 in TensorFlow. This is painfully slow. In an effort to speed things up, I self-compiled TensorFlow with various extended instruction sets enabled (AVX, SSE4.1, SSE4.2). This reduced learning time to about 600 seconds/epoch. Clearly, 10 minutes per epoch is still too slow, as it would take 8-12 hours to complete one learning cycle. I would like to be able to do the learning in about an hour so I can iterate faster.
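For anyone wanting to try the same self-compile, it looks roughly like this (a sketch assuming a TensorFlow 1.x source checkout with Bazel installed and `./configure` already run; exact paths and flags may differ for your setup):

```shell
# Build TensorFlow from source with extended CPU instruction sets enabled.
# Run from the root of a TensorFlow source checkout, after ./configure.
bazel build -c opt \
    --copt=-mavx --copt=-msse4.1 --copt=-msse4.2 \
    //tensorflow/tools/pip_package:build_pip_package

# Package the optimized build into a pip wheel and install it.
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-*.whl
```

The `--copt` flags let the compiler emit the vector instructions the stock pip wheel leaves out, which is where the ~1000 s to ~600 s/epoch improvement comes from.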
In trying to speed up the learning time per epoch even further, I am trying to make use of TensorFlow's experimental OpenCL support. This has been a very rough patch without resolution so far. I am stuck trying to get the application to compile successfully (I will make another attempt tonight).
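For context, TensorFlow's experimental OpenCL path goes through SYCL via Codeplay's ComputeCpp. Enabling it at configure time looks roughly like this (a sketch only; the exact environment variables, prompts, and the ComputeCpp install path shown are assumptions that vary by TensorFlow version):

```shell
# From the TensorFlow source root: answer "y" to the OpenCL question,
# or pre-seed the configure script with environment variables.
export TF_NEED_OPENCL=1
# Assumed install location of the ComputeCpp SYCL toolkit.
export COMPUTECPP_TOOLKIT_PATH=/usr/local/computecpp
./configure

# Build with the SYCL configuration.
bazel build -c opt --config=sycl \
    //tensorflow/tools/pip_package:build_pip_package
```

In my experience this is exactly the step that keeps failing, which is part of why I am writing this post.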
Here is the situation I find myself in: AMD is proposing ROCm as its solution for deep learning. In reality, ROCm provides only the tools that would enable someone to build a deep learning library. It is not a deep learning compute stack like CUDA+cuDNN, which NVIDIA offers and on top of which deep learning libraries like TensorFlow, Torch, and Caffe are built.
Even my simple attempts at getting TensorFlow to run on my A10-7850 have been frustrating. I really want AMD to succeed so that there are alternatives to just using NVIDIA GPUs, but I don't see how it can, given what I observe:
- I see no discussion and/or significant progress towards CUDA/cuDNN equivalents that just work with any major deep learning framework such as Torch, Tensorflow, Theano, or Caffe. There are experimental branches trying to use OpenCL, but those branches have existed since roughly 2015 and have not made sufficient progress since then. Update frequency is very low. Support is low. As a result, the user community is small.
- There is no clear road map provided by AMD or any of its partners.
- I don't see AMD or any of its partners putting resources into developing this infrastructure.
- There are no major discussions here on the AMD forums. For that matter, there is not even a machine learning or deep learning subforum to discuss any of this.
In about 3-6 months, I will have to decide what build to go with. I would like to be able to do the same kind of deep learning on either a Ryzen 5 1600(X) CPU or a Raven Ridge 1500S (if such a thing gets released), and pair the CPU with a GPU. I want to at least explore how viable a non-NVIDIA approach to deep learning is before deciding. Since there is still time to consider, I am asking AMD and the community at large:
What is the Machine Learning Roadmap on AMD Hardware?
This is important because I want to emphasize that hardware support alone is insufficient. If there is not enough software support available, AMD will not make inroads into deep learning. At this point, I don't want to write tools to do deep learning. I want to just do deep learning.
Message was edited by: Anthony Le: Cleaned up some grammar errors and clarified contribution of CUDA+cuDNN.