Does the CAL compiler "optimize" device-specific ISA kernels when they are compiled using calclCompile()? Or does it leave them just as they are?
The CAL Compiler does a lot of optimizations, some of which can be seen in Norm Rubin's CGO 08 keynote presentation here:
So if I write a kernel in device-specific ISA it will still optimize it?
How do I turn the "optimizations" off? Is it possible?
It is not possible to turn compiler optimizations off. If you use calclAssemble instead, then there should not be any optimizations performed, but you must guarantee that your program follows all the port/register restrictions as specified in the ISA doc.
I don't know what happens when you write GPU ISA directly.
But if you write CAL IL, the compiler does do lots of good optimizations. I have seen it do extensive instruction scheduling, register allocation (IL registers do not necessarily map to GPU registers), dead code elimination, and redundant instruction elimination.
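To make "dead code elimination" concrete, here is a toy sketch in Python. It is not how the CAL compiler is implemented; the three-address instruction format and the names eliminate_dead_code and live_out are made up for illustration. It walks a straight-line program backwards and drops any instruction whose result is never read afterwards.

```python
from typing import List, Set, Tuple

# Hypothetical three-address instruction: (destination, opcode, source operands).
Instr = Tuple[str, str, Tuple[str, ...]]


def eliminate_dead_code(program: List[Instr], live_out: Set[str]) -> List[Instr]:
    """Drop instructions whose result is never read before being overwritten."""
    live = set(live_out)          # registers still needed below the current point
    kept: List[Instr] = []
    for dest, op, srcs in reversed(program):
        if dest in live:
            live.discard(dest)    # this write satisfies the later reads
            live.update(srcs)     # its operands are now needed above this point
            kept.append((dest, op, srcs))
        # otherwise the result is never used, so the instruction is dropped
    kept.reverse()
    return kept


program = [
    ("r0", "mul", ("a", "b")),
    ("r1", "add", ("a", "b")),   # dead: r1 is never read afterwards
    ("r2", "add", ("r0", "c")),
]
optimized = eliminate_dead_code(program, live_out={"r2"})
```

In this sketch the instruction that writes r1 disappears, so the output has fewer instructions and uses fewer registers than the input, which is the kind of change you would notice if you expected your kernel to be left untouched.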
Yes, I am sure that the CAL compiler does plenty of very good optimizations to IL and perhaps even ISA.
For me, I don't want it to change my kernel at all, not even the register count or anything.
This does bring up a good point though:
Is the ISA shown in the KSA for Brook+ the same as the ISA that is actually run for the kernel? Or are there more optimizations performed on the actual kernel during compilation that aren't shown in the KSA?
KSA and CAL use different compiler versions so the results might not be the same.
By "KSA" you mean SKA (Stream KernelAnalyzer), right?
Originally posted by: MicahVillmow KSA and CAL use different compiler versions so the results might not be the same.
WHY do they use different compiler versions? It doesn't make sense to have a tool to help developers optimize and then tell the developers that tool might give different results than the SDK the tool is designed to help the developers optimize for, right?
The Stream SDK and SKA have different release cycles and thus have different versions of the compiler at release time. Since SKA is released more often, it has the chance of picking up newer versions of the compiler than what the SDK has.