timchist
Elite

Stack overflow in clBuildProgram on Catalyst 14.12 with R9 290

When I try to compile certain OpenCL code on a machine with an R9 290 installed and running the Catalyst 14.12 driver, the program crashes with a stack overflow in amdocl64.dll when calling clBuildProgram.

The same code compiles just fine on a machine running HD 7970 with the same driver.

Below is the minimum kernel causing this to happen:

__kernel void test(__global float* srcBase, __global float* dstBase, int width, int height)
{
    int x = get_global_id(0);
    if(x < width)
    {
        int y = get_global_id(1);
        if(y < height)
        {
            __global float* dst        = dstBase + x + y * width;
            __global const float* head = srcBase     + y * width;
            int offsetLeft   = x > 0? -1: 0;
            float leftFactor = x > 0? 1: 0;
            float curAverage = leftFactor * (head[offsetLeft]);
            *dst = curAverage;
            dst += width;
        }
    }
}

0 Likes
16 Replies
dipak
Big Boss

Thanks for reporting this and providing the reproducible test case. We'll check and get back to you.

0 Likes

Hi dipak.

Did you manage to reproduce the issue? Is there any news on the problem?

Thanks, Tim

0 Likes

Yes, I'm able to reproduce the issue using a sample test project. However, when I compile the kernel code using CodeXL, it works fine. Another interesting point is that the test code works fine in the following cases:

1) Disable the optimization by passing "-O0" or "-cl-opt-disable" during clBuildProgram()

2) Building as OpenCL 2.0 kernel code by passing "-cl-std=CL2.0" during clBuildProgram() [with/without optimization]

I guess it's a compiler bug. I've asked someone for clarification. If needed, I'll file a bug report against it. Meanwhile, you may try the above workarounds and let me know your findings.
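The two workarounds above only change the options string passed to clBuildProgram. A minimal sketch of how a host program might assemble that string follows. The helper name make_build_options is hypothetical and not part of any OpenCL API; only the flags "-cl-opt-disable" and "-cl-std=CL2.0" come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread or the OpenCL API):
 * writes the option string for the two workarounds into buf.
 * Returns the string length, or -1 if buf is too small. */
static int make_build_options(char *buf, size_t size,
                              int disable_opt, int use_cl20)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%s%s%s",
                     disable_opt ? "-cl-opt-disable" : "",
                     (disable_opt && use_cl20) ? " " : "",
                     use_cl20 ? "-cl-std=CL2.0" : "");
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Usage with the OpenCL host API, assuming `program` and `device`
 * already exist in the calling code:
 *
 *     char opts[64];
 *     if (make_build_options(opts, sizeof opts, 1, 0) >= 0)
 *         err = clBuildProgram(program, 1, &device, opts, NULL, NULL);
 */
```

Keeping the flags behind two switches makes it easy to test each workaround separately or both together.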

Regards,

0 Likes

Compilation works with "-O0", "-cl-std=CL2.0", or both. However, in these cases the application still fails with a stack overflow in amdocl64.dll when it calls clBuildProgram after clCreateProgramWithBinary on the binary produced by that compilation. As I explained in another post, we use offline compilation and do not ship our apps with OpenCL sources.

Also, when the "-fno-bin-llvmir" flag is passed, compilation results in this error:

An error with the ELF object was encountered.

Error while BRIG Codegen phase: compilation error

0 Likes

Merry Christmas, Dipak.

Any progress on the issue?

0 Likes

Merry Christmas.

I've filed an internal bug report against the issue. If I get any update, I'll share with you.

Regards,

0 Likes

Hi Tim,

Good news. I recently got an update that this issue has been fixed in the latest internal driver build. Hopefully the fix will be available in a public version soon.

Regards,

0 Likes

Thanks Dipak.

Hopefully other problems recently reported by me will be fixed soon too.

0 Likes
timchist
Elite

We have just found out that the same kernel causes compilation to crash with the Catalyst 14.9 driver too when GPU_FORCE_64BIT_PTR is set to 1, on both machines (with R9 290 and HD 7970).

0 Likes

Timofey Chistyakov  - checking in, relative to "We have just found out that the same kernel causes compilation to crash with Catalyst 14.9 driver too when GPU_FORCE_64BIT_PTR is set to 1 on both machines (with R9 290 and HD 7970)."

Although there are no specific reproducibility instructions, is this the same problem that started this thread, or something different? If it's different, I'll split this into a new topic. If it's an instance of the same issue, I'll ignore, since Dipak reported the issue is fixed and should see the light of day soon.

0 Likes

Hi jtrudeau,

The reproducibility instructions are pretty similar to those in the original post: just try to compile the kernel (the source is posted above) on Catalyst 14.9 with GPU_FORCE_64BIT_PTR set to 1 on a machine with a Radeon R9 290 or HD 7970 installed (I did not try other cards). Doing that results in a stack overflow in amdocl64.dll.
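For anyone reproducing this from a test program rather than a shell, here is a minimal sketch of setting the variable in-process. The helper name force_64bit_ptr is hypothetical. It is also an assumption, not something confirmed in this thread, that the runtime reads GPU_FORCE_64BIT_PTR when it is first loaded, so the call is placed before any OpenCL API call. If in doubt, set the variable in the environment before launching the process.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 200112L  /* for setenv on POSIX systems */
#endif
#include <stdlib.h>

/* Hypothetical helper (not part of OpenCL): sets GPU_FORCE_64BIT_PTR=1
 * for the current process. Assumption: the driver reads the variable on
 * first load, so call this before any OpenCL function.
 * Returns 0 on success, nonzero on failure. */
static int force_64bit_ptr(void)
{
#ifdef _WIN32
    return _putenv_s("GPU_FORCE_64BIT_PTR", "1");
#else
    return setenv("GPU_FORCE_64BIT_PTR", "1", 1);
#endif
}
```

Call force_64bit_ptr() at the very start of main, then build the kernel as usual.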

I don't know whether this behaviour is caused by the same error that is presumably fixed in an internal version of the driver, as I don't know what the error was, nor do I have access to that version of the driver to test and verify directly.

0 Likes

OK. I'll leave that report in this thread, and pass it on to the team. Thanks for the feedback.

0 Likes

As of Catalyst 15.4 beta the error still persists.

0 Likes
internetvietnam
Journeyman III

Thanks, timchist. Very useful article; thank you for sharing the information.

0 Likes
timchist
Elite

Catalyst 15.5 beta: the error is still reproducible.

0 Likes

The issue is no longer reproducible in Catalyst 15.7.

Regards,

0 Likes