jpsollie
Adept II

clBuildProgram causes BRIG validation error


Hi everyone,

First, my system.

Hardware:

2x Opteron 6276, 128 GB RAM, combined with an R9 Nano

Software:

- Linux 4.10.17 x64

- LLVM 4.0.1 & 5.0.0 (git)

- AMD 17.30 OpenCL framework

Problem:

(I narrowed the config down so that only the amdgpu-pro ICD file is loaded.)

When I create a program that compiles on the CPU, it works fine.

When I do the same on the GPU, it crashes with the following error:

---------------------------------------------------------------------

Error in hsa_operand section, at offset 121368:

Address is outside of memory allocated for variable

LLVM ERROR:

Brig container validation has failed in BRIGAsmPrinter.cpp

------------------------------------------------------------------------

What I tried:

- Mesa OpenCL (compiles, but does not produce a correct result)

- pocl & LLVM 5.0.0 (works perfectly)

- amdgpu-pro CPU driver (2348.3) (works perfectly)

- amdgpu-pro GPU driver (2442.7): same error as 2348, but does not show a CPU ...

*Edit:

I also tried Oclgrind and CodeXL; no problems there.

I suspected the error was somewhere in LLVM, so I switched PATH and LD_LIBRARY_PATH to point to LLVM 4, but that made no difference.

Where does this error come from, and how do I fix it?

Thanks

1 Solution

Accepted Solutions
dipak
Staff

Re: clBuildProgram causes BRIG validation error

First of all, thanks for sharing the repro code. After a quick test, it looks like a compiler optimization issue. The kernel seems to build fine if optimization is disabled, i.e. set to "-O0". I'll check further on another setup and report it to the compiler team if required. Meanwhile, could you please try the same and share your observations?
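As a rough illustration of the workaround above, the build options can be assembled on the host and handed to clBuildProgram as its options argument. This is only a sketch: `make_build_options` is a hypothetical helper name, `-O0` is the flag quoted in this reply, and whether a particular driver accepts it is an assumption (the OpenCL specification's own option for this is `-cl-opt-disable`). The optional `cl_std` parameter covers the `-cl-std=` flag that comes up later in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a build-options string into buf.
 * disable_opt != 0 adds "-O0" (the flag suggested in this thread).
 * cl_std may be NULL, or a version such as "CL1.2" for "-cl-std=".
 * Returns 0 on success, -1 if buf is too small. */
static int make_build_options(char *buf, size_t size,
                              int disable_opt, const char *cl_std)
{
    int n;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    if (disable_opt) {
        n = snprintf(buf, size, "-O0");
        if (n < 0 || (size_t)n >= size)
            return -1;
    }
    if (cl_std != NULL) {
        size_t used = strlen(buf);
        n = snprintf(buf + used, size - used, "%s-cl-std=%s",
                     used ? " " : "", cl_std);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
    }
    return 0;
}

/* The resulting string would then be passed as the fourth argument:
 *   clBuildProgram(program, 1, &device, options, NULL, NULL);
 * and the build log read back with
 *   clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, ...);
 */
```

Building once with and once without the flag, and comparing the two build logs, is one way to check whether the BRIG validation error really depends on optimization.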

Regarding the atomic query, I would suggest opening a new thread, as it seems to be unrelated. That would also help us track the two issues separately. Please share the repro code and other setup details in that thread.

Regards,

11 Replies
dipak
Staff

Re: clBuildProgram causes BRIG validation error

Hi,

Please provide the repro code for our investigation. Also, please share the clinfo output and OS information.

I assume this is the driver on which you observed the error: AMDGPU-PRO Driver for Linux Release Notes

Regards,

jpsollie
Adept II

Re: clBuildProgram causes BRIG validation error

No problem, here you go.

What do you want to know about my OS?

I know Gentoo Linux is not supported, and neither is kernel 4.10.17. I am not asking you to hand me a solution, but maybe you know more about BRIG/HSAIL compilation than I do.

jpsollie
Adept II

Re: clBuildProgram causes BRIG validation error

Hi Dipak,

I got the code to compile (though I do not know why it works), but I saw the following while debugging at runtime:

atom_inc(system) does not atomically increment the value of local uint system[0], whereas atom_xchg(system, system[0] + 1) does. Do I need to open a new thread for this?

*Edit:

I also saw this behaviour on clover running with LLVM 5.0.

pocl 0.14 (which I use on the Opteron CPUs) shows no difference; it runs on LLVM 4.0.1.

Does this look like an LLVM error, or is it compiler related?

*Edit 2:

This piece of code:

            if(!output[14]) output[14] = system[0] + 1;
            atom_inc(system);
            if(!output[15]) output[15] = system[0];

gives this output in gdb:

Breakpoint 1, worker (device_obj=0x609490) at ./engine.c:397

397                 if(answer[3] == 255) {

(gdb) print answer

$1 = {0, 0, 0, 0, 255, 276, 340, 804850955, 40962, 0, 0, 0, 0, 0, 1, 64}

(gdb) print answer[14]

$2 = 1

(gdb) print answer[15]

$3 = 64

The fact that clover also has this issue suggests an LLVM error, no? Or am I mistaken?
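One possible reading of these numbers, not confirmed anywhere in this thread: if 64 work-items run the three statements in lockstep, all 64 increments land before any work-item reads system[0] again, so 1 and 64 would be the expected values even with a working atom_inc. The sketch below simulates that ordering in plain C with C11 atomics. The work-item count of 64 and the lockstep ordering are assumptions, and `simulate_lockstep` is a hypothetical name.

```c
#include <stdatomic.h>

#define WORK_ITEMS 64   /* assumption: one 64-wide group, not stated in the thread */

/* Simulates the three kernel statements as lockstep phases:
 * every work-item finishes one statement before any starts the next.
 * out14/out15 mirror output[14]/output[15] from the post. */
static void simulate_lockstep(unsigned *out14, unsigned *out15)
{
    atomic_uint counter;            /* stands in for local uint system[0] */
    int i;

    atomic_init(&counter, 0);
    *out14 = 0;
    *out15 = 0;

    /* if(!output[14]) output[14] = system[0] + 1; */
    for (i = 0; i < WORK_ITEMS; i++)
        if (!*out14)
            *out14 = atomic_load(&counter) + 1;

    /* atom_inc(system); */
    for (i = 0; i < WORK_ITEMS; i++)
        atomic_fetch_add(&counter, 1);

    /* if(!output[15]) output[15] = system[0]; */
    for (i = 0; i < WORK_ITEMS; i++)
        if (!*out15)
            *out15 = atomic_load(&counter);
}
```

Under these assumptions the simulation ends with 1 and 64, the same values gdb shows for answer[14] and answer[15]. If each work-item needs its own distinct value, the return value of atom_inc (the value before the increment) is the usual thing to read, rather than system[0] afterwards.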

dipak
Staff

Re: clBuildProgram causes BRIG validation error

I can see the following declaration:

local uint system[0]

atom_add operates on 64-bit values and requires the cl_khr_int64_base_atomics extension to be enabled. For an unsigned int, please try atomic_add instead.

jpsollie
Adept II

Re: clBuildProgram causes BRIG validation error

Tried, failed.

This is confusing. I tried to write everything against OpenCL 1.0, and this extension:

cl_khr_local_int32_base_atomics

says to use atom_add :s
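A small sketch of how the naming differs by OpenCL version, based on a reading of the OpenCL 1.0 and 1.1 specifications: in 1.0 the 32-bit atomics are optional extensions that use the atom_ prefix, while from 1.1 on they are core built-ins with the atomic_ prefix. Both helper names below are hypothetical, and the sketch covers only 32-bit atomics on local memory.

```c
/* Hypothetical helper: returns the #pragma line a kernel would need before
 * calling a 32-bit atomic on local memory, or "" when none is required.
 * major/minor is the OpenCL C version being targeted. */
static const char *local_atomic_pragma(int major, int minor)
{
    /* OpenCL 1.0: 32-bit local atomics are an optional extension. */
    if (major == 1 && minor == 0)
        return "#pragma OPENCL EXTENSION "
               "cl_khr_local_int32_base_atomics : enable\n";
    /* OpenCL 1.1 and later: 32-bit atomics are part of the core language. */
    return "";
}

/* Hypothetical helper: name of the 32-bit increment built-in for a version. */
static const char *atomic_inc_name(int major, int minor)
{
    return (major == 1 && minor == 0) ? "atom_inc" : "atomic_inc";
}
```

A host program could prepend the returned pragma to the kernel source before calling clCreateProgramWithSource, so that the same kernel text builds against either version.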

dipak
Staff

Re: clBuildProgram causes BRIG validation error

Okay. I thought it was for OpenCL 1.2.

I hope you are setting the "-cl-std=" flag to match the targeted OpenCL version. If no flag is set, OpenCL 1.2 is assumed by default when building kernels.

Please share a test case so I can check it on my end.

Regards,

jpsollie
Adept II

Re: clBuildProgram causes BRIG validation error

I made a new topic for it: OpenCL atomic_add and atomic_inc not working correctly. I suggest we continue working on it in that one.

To answer your question: no, I hadn't. I have now added -cl-std=CL1.0, but the error is still there.

dipak
Staff

Re: clBuildProgram causes BRIG validation error

Update:

A ticket has been opened for this issue. I'll share any updates with you as soon as I have them.

Regards,
