I'm sorry it is taking so long to answer your question. I'm trying to find someone who knows the answer. Thanks for your patience.
I'm sorry that I still haven't found anyone who knows the answer to your question. Since it is taking so long, let's try two other avenues:
1. Try posting your question on the technology forums: http://forums.amd.com/forum/categories.cfm?catid=12&entercat=y
2. Also, you can contact AMD support either by email (http://emailcustomercare.amd.com/) or by phone (http://support.amd.com/us/contacts/Pages/global-technical-support.aspx)
If the DtoD copy is within the same device, 20 GB/s is quite low for any discrete GPU. There may be a problem in your program.
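Before comparing a measured figure against what the hardware should deliver, it is worth checking how the number was computed. Here is a minimal sketch, assuming you have the copy size in bytes and the elapsed time in nanoseconds (for example from OpenCL event profiling, which reports timestamps in nanoseconds). The function name and the sample values are hypothetical, not taken from your program. A device-to-device copy both reads and writes every byte, so some tools count the traffic twice; the flag below makes that choice explicit.

```python
def copy_bandwidth_gb_per_s(num_bytes, elapsed_ns, count_read_and_write=False):
    """Return the bandwidth of a timed buffer copy in GB/s (1 GB = 1e9 bytes)."""
    if elapsed_ns <= 0:
        raise ValueError("elapsed_ns must be positive")
    # A device-to-device copy reads and then writes each byte, so the memory
    # traffic is twice the payload when both directions are counted.
    traffic = 2 * num_bytes if count_read_and_write else num_bytes
    # Bytes per nanosecond is numerically equal to GB/s.
    return traffic / elapsed_ns


if __name__ == "__main__":
    # Hypothetical measurement: a 1 GiB buffer copied in 50 ms.
    size_bytes = 1 << 30
    elapsed_ns = 50_000_000
    print(copy_bandwidth_gb_per_s(size_bytes, elapsed_ns))
    print(copy_bandwidth_gb_per_s(size_bytes, elapsed_ns, count_read_and_write=True))
```

If your 20 GB/s counts only the payload, the actual memory traffic is about twice that. It also helps to time several copies of a large buffer and average them, since a single small copy is dominated by launch overhead.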
Your numbers do not seem very far from these. Did you have a look at this presentation? :
Did you try the bandwidth tests in the SDK samples?