rahulgarg
Adept II

Suggest Feature you want in AMD APP

Also, being able to overlap computation and data transfer by submitting them to two different queues would be great.

Discussion in this thread: http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=149671

0 Likes
rahulgarg
Adept II

Adding one more point:

My understanding is that out-of-order queues are not currently supported. Support for them would be great.

0 Likes
rahulgarg
Adept II

Adding yet another point: I would also like to see shorter compile times for OpenCL kernels. If you are building lots of large kernels, compile time can be a non-trivial overhead.

edit: Compared to OpenCL, CAL compilation is really fast. I guess that is to be expected, but it does narrow the scenarios where OpenCL is applicable: you can no longer use lots of tiny kernels, because the compilation overhead itself can become large.

0 Likes
MicahVillmow
Staff

rahulgarg,
One thing you can do is use offline devices to generate binaries for all devices and then just load the binaries. If you strip everything but the ISA out of the binary, the binaries themselves can be really small. We are working on decreasing compile time, but this is another option.
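As a rough illustration of the "generate once, load later" idea, here is a minimal sketch in plain C of caching a program binary on disk. The helper names `save_binary` and `load_binary` are hypothetical, not part of any SDK. In a real host program the bytes would typically come from `clGetProgramInfo` with `CL_PROGRAM_BINARIES` and later be handed to `clCreateProgramWithBinary` followed by `clBuildProgram`; those OpenCL calls are left out so the sketch stays self-contained.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write a program binary to disk.
   Returns 0 on success, -1 on failure. */
int save_binary(const char *path, const unsigned char *data, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_err = fclose(f);
    return (written == size && close_err == 0) ? 0 : -1;
}

/* Hypothetical helper: read a cached binary back from disk.
   On success returns a malloc'd buffer (the caller frees it) and stores
   its length in *size_out. Returns NULL on failure. */
unsigned char *load_binary(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);
    /* Allocate at least one byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }
    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }
    *size_out = (size_t)len;
    return buf;
}
```

If `load_binary` returns NULL (for example on the first run), the application would fall back to building from source and then call `save_binary` so that later runs can skip compilation.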
0 Likes
jross
Adept I

Some of us don't mind long kernel compile times if the runtime takes more than a couple minutes.  We would rather have optimized binaries.  Some kind of -O2 or -O3 compiler option would be appreciated.

Edit: @MicahVillmow below, Awesome! I'll check it out.

0 Likes
MicahVillmow
Staff

jross,
I believe support for -O0 through -O3 is in the current release, but only -O3 is currently supported, and -O1 and -O2 map to -O3. This will change in future releases.
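For readers who want to try these flags, here is a small sketch of building the options string on the host side. The helper `make_opt_flag` is hypothetical; it only formats a flag in the -O0 to -O3 range mentioned above and does not check what a given runtime actually does with each level.

```c
#include <stdio.h>

/* Hypothetical helper: format an optimization flag such as "-O3" into buf.
   Levels outside 0..3 are clamped to that range.
   Returns 0 on success, -1 if buf is too small to hold the flag. */
int make_opt_flag(char *buf, size_t buf_size, int level)
{
    if (level < 0) level = 0;
    if (level > 3) level = 3;
    int n = snprintf(buf, buf_size, "-O%d", level);
    return (n > 0 && (size_t)n < buf_size) ? 0 : -1;
}
```

The resulting string would be passed as the `options` argument of `clBuildProgram(program, num_devices, device_list, options, pfn_notify, user_data)`.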
0 Likes
Meteorhead
Challenger

Yes, we really don't mind long compile times. In fact, it has been reported (for SDK 2.2, so a long time ago) that once a kernel surpassed a certain length, the compiler seemed to give up optimizing GPR usage and started to use scratch registers excessively.

Since some applications run for 2-8 days, I really wouldn't mind if a kernel took 1 minute to compile, as long as that produced a 5% speedup. Having more compile optimization levels would be really useful (you might consider creating an "über" optimization level that completely disregards compile time).

0 Likes
Jawed
Adept II

In my opinion, having worked with IL programming as well as OpenCL, some of the compilation problems/inefficiencies we encounter are purely in the compilation from IL to binary.

Do any of the compilation options apply to the IL->binary compiler? If not, they need to.

0 Likes
s58000
Adept II

Concurrent kernel execution would be a huge improvement, in my opinion.

0 Likes
MicahVillmow
Staff

Jawed,
We are taking that aspect into account when supporting these features. The IL->binary compiler was originally designed to support only one mode, graphics-centric -O3, so it requires a redesign, which is why it has not been exposed before.
0 Likes