I was looking at this article:
I was wondering where one can obtain the Linpack code they used?
Feel free to get in touch with Neel and Tushar directly!
Thanks! Tushar sent the link to the wrapper they used
cl-linpack-wrap - wrappers for clAMDBlas - Google Project Hosting
I managed to compile everything and I can run the original hpl-2.1 on the nodes, but when I link with your wrapper I get the following problems. I am trying to figure out why this is happening. Other OpenCL programs appear to work fine on the nodes. I wonder if it has anything to do with the input file... any ideas?
$ ./xhpl
Result amax: 2058
Result amax: 345
Result amax: 2248
Result amax: 2247
clblasSgemm() failed with -1009
clblasStrsm() failed with -1011
clblasSgemm() failed with -1010
Result amax: 1483
Result amax: 1752
Result amax: 1707
Result amax: 670
clblasSgemm() failed with -1009
clblasStrsm() failed with -1011
clblasSgemm() failed with -1010
Result amax: 1707
Result amax: 415
Result amax: 1710
...
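For anyone reading the numeric codes in that output: a small lookup can turn them into names. This is only a sketch. It assumes that the extended status codes in clBLAS.h start at -1024 (clblasNotImplemented) and increase by one in declaration order; the table below is written from memory of that header, and the helper name decode_clblas_status is made up for illustration, so please check it against the clBLAS.h you actually built with.

```python
# Assumption: clBLAS extended status codes begin at -1024 and count upward
# in the order they are declared in clBLAS.h. Verify against your header.
CLBLAS_EXTENDED_BASE = -1024

CLBLAS_EXTENDED_NAMES = [
    "clblasNotImplemented",
    "clblasNotInitialized",
    "clblasInvalidMatA",
    "clblasInvalidMatB",
    "clblasInvalidMatC",
    "clblasInvalidVecX",
    "clblasInvalidVecY",
    "clblasInvalidDim",
    "clblasInvalidLeadDimA",
    "clblasInvalidLeadDimB",
    "clblasInvalidLeadDimC",
    "clblasInvalidIncX",
    "clblasInvalidIncY",
    "clblasInsufficientMemMatA",
    "clblasInsufficientMemMatB",
    "clblasInsufficientMemMatC",
    "clblasInsufficientMemVecX",
    "clblasInsufficientMemVecY",
]


def decode_clblas_status(code):
    """Map a numeric status to a name (hypothetical helper, not a clBLAS API)."""
    index = code - CLBLAS_EXTENDED_BASE
    if 0 <= index < len(CLBLAS_EXTENDED_NAMES):
        return CLBLAS_EXTENDED_NAMES[index]
    return "not an extended clBLAS code: {}".format(code)


if __name__ == "__main__":
    # The three codes that appear in the xhpl output above.
    for code in (-1009, -1011, -1010):
        print(code, decode_clblas_status(code))
```

Under that assumption, all three codes are "insufficient memory" statuses for the C, A and B matrix buffers, which would point at buffer sizes rather than at the HPL input file.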
Did you solve those issues?
I'm having the same problems
I think I resolved this issue, but I don't remember what the problem was. I think it was related to arrays being too large to fit in the buffers (I am not sure if this was the case), or maybe I was supposed to use floats while HPL uses doubles (perhaps this was it, because there is no way they could get such high results with doubles in that competition). But HPL was too slow anyway. I tried it with ACML 6. AMD suggested mHPL, but I didn't try it. Here is a thread you may be interested in:
Re: ACML 6 and A10-7850K HPL performance
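On the buffer-size and float-versus-double guesses: the arithmetic behind them can be sketched in a few lines. This is not the wrapper's actual logic. The helper names and the matrix order n = 4096 are made up for illustration, and the only facts used are that a single-precision element takes 4 bytes and a double-precision one takes 8.

```python
FLOAT_BYTES = 4   # single precision, as used by the clblasS* routines
DOUBLE_BYTES = 8  # double precision, as used by stock HPL


def matrix_bytes(rows, cols, elem_bytes):
    """Bytes needed to store a dense rows x cols matrix."""
    return rows * cols * elem_bytes


def fits_in_buffer(rows, cols, elem_bytes, buffer_bytes):
    """True if the matrix fits in a device buffer of buffer_bytes."""
    return matrix_bytes(rows, cols, elem_bytes) <= buffer_bytes


if __name__ == "__main__":
    n = 4096  # hypothetical matrix order, not taken from the thread
    # Suppose the device buffer was sized for an n x n float matrix.
    buffer_bytes = matrix_bytes(n, n, FLOAT_BYTES)
    print(buffer_bytes // (1024 * 1024), "MiB buffer")
    print("float matrix fits:", fits_in_buffer(n, n, FLOAT_BYTES, buffer_bytes))
    print("double matrix fits:", fits_in_buffer(n, n, DOUBLE_BYTES, buffer_bytes))
```

The same matrix needs twice the space in doubles, so a buffer that is sized for one element type and used with the other, or simply sized for a smaller problem, could fail a size check.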
Maybe you should try to use it with mHPL? Please update the thread if you succeed!