maximmoroz,
Thanks for reporting this issue. It has been fixed internally, and the fix should appear in the next Catalyst release. Have you tried the 12.6 beta to see whether it fixes your issue? http://support.amd.com/us/kbarticles/Pages/AMDCatalyst126beta.aspx
Hi Micah,
12.6 does indeed fix the issue. However, compilation still takes a long time: about a minute to compile the kernel, which is hardly acceptable.
Kernel Analyzer also shows that the compiler is not able to optimize efficiently (see the "Not unrolled because its trip count is unknown" messages):
LOOP UNROLL: pragma unroll (line 278)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 273)
Not unrolled because its trip count is unknown!
LOOP UNROLL: pragma unroll (line 268)
Not unrolled because its trip count is unknown!
LOOP UNROLL: pragma unroll (line 178)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 234)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 230)
Not unrolled because its trip count is unknown!
LOOP UNROLL: pragma unroll (line 226)
Not unrolled because its trip count is unknown!
LOOP UNROLL: pragma unroll (line 219)
Unrolled as requested!
LOOP UNROLL: pragma unroll (line 215)
Not unrolled because pragma requests no unroll
LOOP UNROLL: pragma unroll (line 211)
Not unrolled because pragma requests no unroll
LOOP UNROLL: pragma unroll (line 165)
Not unrolled because pragma requests no unroll
LOOP UNROLL: pragma unroll (line 129)
Unrolled as requested!
Thank you.
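A side note on the "trip count is unknown" lines: that message usually means the loop bound is not a compile-time constant. When the host program assembles the kernel source as a string, one possible workaround is to write the bound into the source as an integer literal. The sketch below is only an illustration under that assumption; `emit_sum_kernel`, the `sum_rows` kernel and its arguments are hypothetical names, not taken from the kernel discussed here.

```c
#include <stdio.h>

/* Hypothetical helper: writes a small OpenCL C kernel into buf, with the
 * loop bound emitted as an integer literal so the device compiler sees a
 * constant trip count. Returns what snprintf returns: the number of
 * characters the full source needs (excluding the terminator), or a
 * negative value on error. */
int emit_sum_kernel(char *buf, size_t buf_size, int trip_count)
{
    return snprintf(buf, buf_size,
        "__kernel void sum_rows(__global const float *in,\n"
        "                       __global float *out)\n"
        "{\n"
        "    const size_t gid = get_global_id(0);\n"
        "    float acc = 0.0f;\n"
        "    #pragma unroll\n"
        "    for (int i = 0; i < %d; ++i) {\n"
        "        acc += in[gid * %d + i];\n"
        "    }\n"
        "    out[gid] = acc;\n"
        "}\n",
        trip_count, trip_count);
}
```

The resulting string can be passed to `clCreateProgramWithSource` as usual. Whether the compiler then honours the pragma still depends on the driver version.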
Try using binary kernels.
That is not possible in my case. I generate the kernel source code at run time.
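If the same generated source tends to recur between runs, a middle ground might be to cache the compiled binary under a key derived from the source text. The sketch below shows only the key derivation, using an FNV-1a hash; `fnv1a64_update`, `cache_file_name` and the `kernel_<hash>.bin` naming scheme are hypothetical. Fetching the binary with `clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...)` and reloading it with `clCreateProgramWithBinary` is left out.

```c
#include <stdint.h>
#include <stdio.h>

/* Folds the bytes of s into a running 64-bit FNV-1a hash. */
uint64_t fnv1a64_update(uint64_t h, const char *s)
{
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;            /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Hypothetical helper: builds a cache file name from the generated kernel
 * source and the device name, so a binary built for one device is not
 * reused on another. Returns what snprintf returns. */
int cache_file_name(char *out, size_t out_size,
                    const char *source, const char *device_name)
{
    uint64_t h = 14695981039346656037ULL; /* FNV-1a 64-bit offset basis */
    h = fnv1a64_update(h, source);
    h = fnv1a64_update(h, device_name);
    return snprintf(out, out_size, "kernel_%016llx.bin",
                    (unsigned long long)h);
}
```

On a cache miss the program would be built from source as before and its binary stored under this name. A real key would probably also need the driver version and the build options, since a binary built by one driver may not load on another.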
With an internal debug build, this compiled in a few seconds, so expect improvements with our next release.
May I expect the fix in the 12.6 release?
I don't know exactly what caused the slowdown, just that it was fixed at some point between the 12.6 beta and what should be in our next release.
Thanks again.