thesmileman
Journeyman III

#pragma unroll issues - tries to unroll all loops

If I have two or more loops and I use #pragma unroll X (with X being any number) on one loop, the compiler will try to unroll all loops.

This is frustrating as I can get better performance on some loops without unrolling but for most I want it on. This occurs even if the loops are not nested within each other.
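
For readers unfamiliar with what an unroll factor does, here is a minimal sketch in plain C of the transformation that "#pragma unroll 4" asks for, written out by hand. The function names are hypothetical and this is only an illustration of the technique, not what any particular OpenCL compiler emits.

```c
#include <stddef.h>

/* Rolled loop: one add per iteration, plus a compare and branch each time. */
int sum_rolled(const int *a, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* The same loop unrolled by a factor of 4: the body is repeated four
   times per trip, so the loop overhead is paid a quarter as often,
   at the cost of a larger code size. */
int sum_unrolled4(const int *a, size_t n)
{
    int s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    /* Remainder loop for the elements left when n is not a multiple of 4. */
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

Both functions return the same sum for any input; only the shape and size of the generated code differ, which is why unrolling can help some loops and hurt others.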

4 Replies

thesmileman,

Can you get the compiler log to see which loops are being unrolled by the pragma? Our low-level compiler will auto-unroll loops based on register pressure, and the #pragma unroll does not affect this component yet.


Micah,

The compiler is clearly responding to the "#pragma unroll": without it my compiled binary is 25K, and with it the binary is almost 2MB. It also takes almost 15 minutes to unroll the loops when the pragma is in there, and the code compiles almost instantly when the pragma is commented out.

I will have to write an abstracted version as the code in question is on a closed system. Here is an abstract of what I am talking about:

With the pragmas commented out, as shown below, the code compiles really quickly and clearly isn't being unrolled, as the resulting assembly is very short. If you comment out only the first two pragmas and compile, six months later you will get a compiled version that is absolutely huge. You should be able to just drop the code into the Kernel Analyzer (although you might need to make SIZE_B smaller there, as it seems to take much longer to do the unrolling on these really large unrolls than just compiling).

#define SIZE_A 8
#define SIZE_B 1024

//#pragma unroll SIZE_A
for (int y = 0; y < SIZE_A; y++) {
    //#pragma unroll SIZE_A
    for (int x = 0; x < WORK_SIZE; x++) {   // WORK_SIZE is not shown in this abstract
        // DO SOMETHING
    }
}

//#pragma unroll SIZE_B
for (int i = 0; i < SIZE_B; i++) {
    //#pragma unroll SIZE_B
    for (int j = 1; j < SIZE_B; j++) {
        // DO SOMETHING ELSE
    }
}
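
To put the size blow-up in perspective, here is a rough sketch in plain C that counts how many copies of a loop body a full unroll of a two-deep nest would emit. It assumes an unroll factor equal to the trip count means a full unroll, which is my reading of the pragma rather than documented compiler behaviour, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Trip count of: for (i = start; i < end; i++) */
uint64_t trip_count(int64_t start, int64_t end)
{
    return end > start ? (uint64_t)(end - start) : 0;
}

/* Copies of the innermost body after fully unrolling both levels of a
   two-deep loop nest: every outer iteration replicates every inner one. */
uint64_t full_unroll_copies(uint64_t outer_trips, uint64_t inner_trips)
{
    return outer_trips * inner_trips;
}
```

For the second nest above, full_unroll_copies(trip_count(0, 1024), trip_count(1, 1024)) gives 1024 * 1023 = 1,047,552 copies of the body, so even a body of a few instructions turns into a very large binary and a long compile.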


thesmileman,

We don't need the actual code; just get the compiler log from OpenCL and see what the pragma output is. The engineer in charge of this told me that the log should tell you what is being unrolled and what isn't.


Micah,

The log only mentions the loops that have the pragma on them as being unrolled as requested, at least in the test case I made. This may indeed mean that it is not unrolling the other loops. You mentioned earlier that the LLVM compiler was doing automatic loop unrolling and ignoring the unroll pragma, correct? That doesn't seem to be the case here. While the compiler may not mention that it is unrolling the later loop, it is obviously doing something funny, because the generated code is an order of magnitude larger than if the later loop were not in the function. As I am sure you can see, the later loop would only be a few lines of assembly if not unrolled, and indeed if I compile the loops separately using pragmas and add up the resulting code, I get a similar file size to the one with two of the pragmas commented out.

Thank you for your help.
